V1

output the graph basic
graph output prettier
combine http and https backends
add --max-depth
detect service name from /etc/chef/client.rb
add nsq graph
add --skip-nsq-topics
detect defunct children via haproxy stats
skip display defunct children
haproxy 1.6 and 1.8 compatibility
add error state for "null" NSQ clients (use "?")
detect missing stats socket haproxy
task: validate missing haproxy stats socket config is live manually with knife ssh
task: validate missing haproxy stats socket is live manually with knife ssh
add "no consumer" NSQ detection
task: validate DEFUNCT-ness
multiple seeds

V2.async

use asyncssh (10x faster!)
remove return_exceptions=True
use global BASTION connection (ronf/asyncssh#270)
limit concurrency w/ semaphore
split to modules
re-use ssh connection for get-name/get-config calls
pass lightweight node-ref through async calls instead of node dict
remove pending node print
deal with formatting/output-ordering implications
convert recursive crawl from await to ensure_future
improve live output rendering
fix introduced parent['last_sibling'] bug
bug: cycle is correct in the tree, but rendering zombie children (only for first level cycles?)
retry ssh connection 3 times, fine tune concurrency
introduced: --output=stdout is now broken due to render_node_live
rename water to water_spout, private module function
consolidate find..children error checking
validate frontend-router
move connection semaphore to ssh_layer
better trace/debug log levels
consolidate nsq node relationships w/ multiple connections
deal w/ SSH config: bastion & username
refactors from PR review (reduce complexity, procedural styling)
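
The "limit concurrency w/ semaphore" and "retry ssh connection 3 times" items above could look roughly like the sketch below. It uses only `asyncio` and is not the project's actual code: `with_retry`, `crawl_all`, the limit of 10, and the choice to retry on `OSError` are all assumptions made for illustration, and `connect` stands in for whatever coroutine opens the SSH connection.

```python
import asyncio

MAX_CONCURRENCY = 10  # hypothetical default, not taken from the project
MAX_ATTEMPTS = 3      # "retry ssh connection 3 times"


async def with_retry(connect, semaphore, attempts=MAX_ATTEMPTS):
    """Await connect() while holding the semaphore, retrying on OSError."""
    last_error = None
    for _ in range(attempts):
        # The semaphore caps how many connection attempts run at once.
        async with semaphore:
            try:
                return await connect()
            except OSError as exc:
                last_error = exc
        # Yield to the event loop before the next attempt.
        await asyncio.sleep(0)
    raise last_error


async def crawl_all(connectors, limit=MAX_CONCURRENCY):
    """Schedule every connector concurrently, bounded by `limit`."""
    semaphore = asyncio.Semaphore(limit)
    tasks = [asyncio.ensure_future(with_retry(c, semaphore)) for c in connectors]
    return await asyncio.gather(*tasks)
```

Releasing the semaphore between attempts lets other pending connections proceed while a failed one waits to retry.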

V2.features

DISPLAY: output in json
DISPLAY: load json file
DISPLAY: output in graphviz
DISPLAY: graphviz source
CRAWL: detect proxysql
CRAWL: cassandra
CRAWL: detect well known ports w/ netstat & AWS name lookup (cx, memcache, redis)
CRAWL: detect postgres well known port - causing trouble w/ name lookup
CRAWL: user defined links
move hints/skips to web.yaml
keep config.yaml
CRAWL: kinesis

V2.refactor

move grouping of nsq topics to application layer, on service_name instead of IP
config_errors -> warnings, crawl_errors -> errors
refactor ssh config to ssh config file
refactor --hide-defunct to --skip-defunct and do not even (crawl)
graphviz warn/error color coding
remove "cruft" handling
add quick filter to rewrite service_name mysql-main-port_3306 to mysql-main-r/o
create objects or named tuples (dataclasses!)
PEP8, 120 line length
CHARLOTTE: make the get_config function into configurable parsers definable in YAML
charlotte: replace 'null' response from NSQ for missing IP w/ actual None response
charlotte: move crawl strategy exceptions (frontend-router) into charlotte
charlotte: move blocking logic to charlotte
charlotte: rename crawl_strategy -> crawl_provider on Node()
charlotte: move service_name_rewrite to charlotte
rename protocol_detail -> protocol_mux
CHARLOTTE: --skip-{name} arguments
--skip-defunct -> --hide-defunct
refactor database named matching to port matching
move skip services from globals to argparse
move crawl_complete, name_lookup_complete to node.py
charlotte config 1 file to directory of yaml files
create default yaml file for argparse
rename ip -> instance_address
remove crawl strategy object from Node, denormalize (protocol, blocking)
merge hints into pre-existing children w/ unknown address
CORE: add sub commands for ['crawl', 'render-json']
CORE (OSS): unit tests tests tests (round I - excluding provider_*.py and crawl.py)

V2.bugs

BUG: nsq channels on same node are not grouping, again!
there is a regression in cycle detection - spider against async-cake-handler to repro
trim double quotes from service_name
BUG: crawl of well known port is discovering random connections to frontend-routers, ELBs - fixed by chris r. source ephemeral port filter
'CYCLE': f"service '{node['service_name']}' discovered as a parent of itself!",
paramiko nested exception outputting
handle actually null (absent value) nsq consumer in addition to string literal "null"
ascii renderer grouping by detail is persisting in memory (groupings)
charlotte: move name parser exceptions (mysql-main) into charlotte
we see many group-by-service-name NSQ topics/channels repeating in ascii renderer
catch timeout for crawling children
remove trailing _ from node_ref
graphviz blocking is backwards
regression defunct in parser check on num_connections == 0 is failing
differentiate RDS databases found in AWS - currently all show as rdsnetworkinterface
BUG: add type to json serialization - currently brittle: key-ing off of random fields for deserialization
infinite recursion bug introduced by the crawl hints. it had to do with the cached_nodes in crawl.py being a by_ref object; a deep-ish copy fixed it
trying to crawl json that was outputted with --depth arg results in hanging wait_for_crawl to complete on nodes

V3 Kubernetes++

CRAWL: kubernetes - take a hint
CRAWL: kubernetes - name lookup, crawl
support EKS cluster in a different AWS account than provider_aws

V3.refactor

static code analysis (prospector) and forthcoming changes
refactor providers to objects, remove SSH logic from crawl.py
caching children in crawl.py instead of providers!!
fix TIMEOUT logic
put provider_args back in crawl strategies! use **kwargs to pass args in code
rewrite provider registration
move provider constant refs from constants.py into providers
rename errors.NULL_IP NULL_ADDRESS
refactor signature of crawl_downstream to include address
replace pass through node_ref in crawl w/ zip()
unit tests for crawl, providers, provider_*?
validate that crawl strategies are only used for specified providers
refactor lookup_name to remove life360 business logic from providers!
remove ProviderInterface::configure(), have ssh configure itself on first query
seed provider is configurable command line arg w/

V3.features

FEATURE: make instance_provider args for aws hints part of a refactored "profile"
FEATURE: Distinguish kubernetes service shape in graphviz
add --stop-on-nonblocking CLI arg

V3.bugs

not respecting CrawlStrategy.providers
need to be able to configure different AWS profile for k8s/eks than for aws! (for dev)
BUG: intermittent timeout exceptions which do not result in program exit

V4.VOSS

REFACTOR: (providers): providers as plugin architecture
REFACTOR (spider): --concurrency -> --ssh-concurrency OR provider args
REFACTOR: (all): refactor package architecture
TIMEOUT: (crawl) robust provider timeout and exception handling
OBSCURIFIER (render_*): obscurifier for output
BUG: fix namespace package not being included in dist

V5.PROMVIZ

[~] promviz render output
- render nsq
- haproxy http enabled in prod
- render haproxy
- render proxysql
- BUG: geonames orphaned due to no data returned query
- render haproxy tcp mode
- render elasticache
- render kinesis
- render custom queries
merge hints
add missing hints
render_promviz tests
refactor renderers to plugins
fix plugin imports
refactor/DRY providers/renderers to plugin_core.py
how to organize plugin tests?
move constants.ARGS to cli_args.ARGS
update examples plugins/crawl strategies/docs
[~] PLUGINS: BUG namespace plugins aren't pip install --editable-able

V5.1 NICETOHAVES

ci/cd run tests
ci/cd publish PyPI package
annotate services w/ links to wiki/github

Backlog

New Features

Core

RENDER_PLUGINS: make renderers an abstract class w/ plugins
REFACTOR: move seed logic out of ./spider.py
REFACTOR: revisit the Node{Protocol, CrawlStrategy, protocol_mux} object relationship strategy
FEATURE: track whether a node was skipped for crawling and display as such in graphviz
REFACTOR: move errors/warnings to a global config
REFACTOR: do not block crawl() on lookup_name() in main crawl loop. will speed up many times
REFACTOR: move mutex from provider_ssh to crawl.py
BUG: intermittent timeouts crawling the whole tree - add retry to lookup_name/crawl_downstream?
BUG: remove blocking from CrawlStrategy - it should only be in Protocol
BUG: where is elasticache-time-points? crawl-netstat only takes 1 ip per port, so for async-soa which has 2 downstreams on 6379, it can't find
BUG: where is cx-dvb??
REFACTOR: consolidate Node::crawl_complete and crawl.py::_crawlable()
BUG: required args showing as optional in --help
DOCS: remove non obfuscated example video from README
LOGGER: rewrite logger access for community standards
PLUGPLAY: out of the box functionality by moving TCP to a "builtin" CrawlStrategy and using hostname or default service name
REFACTOR: (providers): rewrite take_a_hint to not return a list, just return a single NodeTransport
DOCS: rewrite docs in sphinx style and prepare for export to readthedocs.org
FEATURE: a new render format that has a predictable sort order, and on top of that the ability to diff

Renderers

test coverage for renderers.py

Render Ascii

FEATURE: merge hints in ascii output

Render Graphviz

FEATURE: multiple seeds display with equal ranking
FEATURE: nsq topics as nodes rather than edges
FEATURE: visualize cycles
FEATURE: different visualization for cache vs database
FEATURE: create a legend

Render JSON

Render New

DISPLAY: output in vizceral format
DISPLAY: 'diff' run on multiple seed nodes and diff!

CrawlStrategies

BUG: HAProxy: functionality to detect bad HAProxy config as a crawl error was lost in async refactor: `if stdout.startswith('ERROR:'): return 'CRAWL ' + stdout.replace("\n","\t"), {}`
BUG: NSQ: misconfigured clients have null server (this is why we don't see rattail -> relapse), investigate & resolve
FEATURE: Netstat: use matchAddress for HAProxy crawl strategies to avoid timeout to RDS hostnames
FEATURE: crawl downstream - ability to specify more providers args per provider (so that k8s can selectively crawl containers)
FEATURE: detect multiple downstream on same port with NetstatCrawlStrategy - it will only pick up the first
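
One possible way to restore the lost HAProxy config check noted above is a small helper that inspects the remote command's stdout before parsing. This is only a sketch: the function name `detect_crawl_error` and its return shape are hypothetical, and only the `ERROR:` prefix test and the newline-to-tab flattening come from the snippet quoted in the BUG item.

```python
from typing import Optional


def detect_crawl_error(stdout: str) -> Optional[str]:
    """Return a one-line crawl error if stdout reports one, else None."""
    if stdout.startswith('ERROR:'):
        # Flatten multi-line error output so it fits on one log/render line.
        return 'CRAWL ' + stdout.replace("\n", "\t")
    return None
```

A caller could return `(error, {})` when the helper yields a string, matching the tuple shape in the original snippet, and otherwise continue parsing children.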

Providers

BUG: cli arg --disable-providers is broken

Provider SSH

FEATURE: revisit whether occupy_one_sempahore_space is working (to dynamically configure --concurrency)
FEATURE: still getting ssh connection errors sometimes without --concurrency=10
FEATURE: configurable "~/.ssh/config" SSH profile
REFACTOR (provider_ssh): we shouldn't use known_hosts=None for security reasons
TEST: write tests for provider_ssh

Provider AWS

FEATURE: lookup_name is slow, use async
CRAWL: dynamodb
CRAWL: SQS
TEST: write tests for provider_aws

Provider K8S

TEST: write tests for provider_k8s

Charlotte

FEATURE (charlotte): yaml validation by schema

Web

Trash Can

backwards compatibility for haproxy w/out stats socket
detect live traffic netstat/tcpdump/ebpf? (this was solved by using haproxy stats)
remove crawl_strategy from Node()
