# V1
- output the basic graph
- prettier graph output
- combine http and https backends
- add --max-depth
- detect service name from /etc/chef/client.rb
- add nsq graph
- add --skip-nsq-topics
- detect defunct children via haproxy stats
- skip displaying defunct children
- haproxy 1.6 and 1.8 compatibility
- add error state for "null" NSQ clients (use "?")
- detect missing haproxy stats socket
- task: validate missing haproxy stats socket config is live manually with knife ssh
- add "no consumer" NSQ detection
- task: validate DEFUNCT-ness
- multiple seeds
- use asyncssh (10x faster!)
- remove `return_exceptions=True`
- use global BASTION connection (ronf/asyncssh#270)
- limit concurrency w/ semaphore
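The semaphore technique this entry refers to can be sketched roughly as below; the function names and the `asyncio.sleep(0)` stand-in are hypothetical, standing in for the real asyncssh calls:

```python
import asyncio

async def crawl_node(node_id, semaphore):
    # Acquire a slot before opening an SSH session; the slot is released
    # automatically when the block exits, bounding concurrent connections.
    async with semaphore:
        await asyncio.sleep(0)  # stand-in for the real asyncssh call
        return f"crawled:{node_id}"

async def crawl_all(node_ids, concurrency=10):
    # One shared semaphore caps how many crawls run at once, no matter
    # how many tasks gather() fans out.
    semaphore = asyncio.Semaphore(concurrency)
    return await asyncio.gather(*(crawl_node(n, semaphore) for n in node_ids))
```

This keeps task creation unbounded (cheap) while bounding the expensive resource, the open SSH sessions.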
- split to modules
- re-use ssh connection for get-name/get-config calls
- pass lightweight node-ref through async calls instead of node dict
- remove pending node print
- deal with formatting/output-ordering implications
- convert recursive crawl from `await` to `ensure_future`
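A minimal sketch of that conversion, with hypothetical child discovery in place of the real SSH crawl: awaiting each child in turn serializes the subtree, while scheduling children with `ensure_future` and gathering lets siblings crawl concurrently.

```python
import asyncio

async def crawl(node, depth=0, max_depth=2):
    # Hypothetical child discovery; the real crawler finds children over SSH.
    children = [f"{node}.{i}" for i in range(2)] if depth < max_depth else []
    # ensure_future schedules each child crawl as its own task, so siblings
    # run concurrently instead of being awaited one at a time.
    tasks = [asyncio.ensure_future(crawl(child, depth + 1, max_depth))
             for child in children]
    subtrees = await asyncio.gather(*tasks)
    return [node] + [name for subtree in subtrees for name in subtree]
```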
- improve live output rendering
- fix introduced parent['last_sibling'] bug
- bug: cycle is correct in the tree, but rendering zombie children (only for first level cycles?)
- retry ssh connection 3 times, fine tune concurrency
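The retry wrapper might look something like this sketch; `connect_with_retry` and its parameters are illustrative names, not the project's actual API:

```python
import asyncio

async def connect_with_retry(connect, attempts=3, delay=0.0):
    # Retry the (hypothetical) connection factory up to `attempts` times,
    # re-raising the last error if every attempt fails.
    last_exc = None
    for _ in range(attempts):
        try:
            return await connect()
        except OSError as exc:
            last_exc = exc
            await asyncio.sleep(delay)
    raise last_exc
```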
- introduced: --output=stdout is now broken due to render_node_live
- rename water to water_spout, private module function
- consolidate `find..children` error checking - validate frontend-router
- move connection semaphore to ssh_layer
- better trace/debug log levels
- consolidate nsq node relationships w/ multiple connections
- deal w/ SSH config: bastion & username
- refactors from PR review (reduce complexity, procedural styling)
- DISPLAY: output in json
- DISPLAY: load json file
- DISPLAY: output in graphviz
- DISPLAY: graphviz source
- CRAWL: detect proxysql
- CRAWL: cassandra
- CRAWL: detect well known ports w/ netstat & AWS name lookup (cx, memcache, redis)
- CRAWL: detect postgres well known port - causing trouble w/ name lookup
- CRAWL: user defined links
- move hints/skips to web.yaml
- keep config.yaml
- CRAWL: kinesis
- move grouping of nsq topics to application layer, on service_name instead of IP
- `config_errors` -> `warnings`, `crawl_errors` -> `errors`
- refactor ssh config to ssh config file
- refactor --hide-defunct to --skip-defunct and do not even (crawl)
- graphviz warn/error color coding
- remove "cruft" handling
- add quick filter to rewrite service_name mysql-main-port_3306 to mysql-main-r/o
- create objects or named tuples (dataclasses!)
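A sketch of what moving to dataclasses could look like; the field names are borrowed from other entries in this log (`service_name`, `instance_address`, `warnings`), and the real `Node` carries more state (protocol, crawl results, errors):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    # field(default_factory=...) gives each instance its own dict/list,
    # avoiding the shared-mutable-default pitfall of plain class attributes.
    service_name: str
    instance_address: Optional[str] = None
    children: dict = field(default_factory=dict)
    warnings: list = field(default_factory=list)
```

Dataclasses also provide `__repr__` and `__eq__` for free, which helps the debug logging and test assertions mentioned elsewhere in this log.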
- PEP8, 120 line length
- CHARLOTTE: make the `get_config` function into configurable parsers definable in YAML
- charlotte: replace 'null' response from NSQ for missing IP w/ actual None response
- charlotte: move crawl strategy exceptions (frontend-router) into charlotte
- charlotte: move blocking logic to charlotte
- charlotte: rename crawl_strategy -> crawl_provider on Node()
- charlotte: move service_name_rewrite to charlotte
- rename protocol_detail -> protocol_mux
- CHARLOTTE: --skip-{name} arguments
- --skip-defunct -> --hide-defunct
- refactor database named matching to port matching
- move skip services from globals to argparse
- move crawl_complete, name_lookup_complete to node.py
- charlotte config 1 file to directory of yaml files
- create default yaml file for argparse
- rename `ip` -> `instance_address`
- remove crawl strategy object from Node, denormalize (protocol, blocking)
- merge hints into pre-existing children w/ unknown address
- CORE: add sub commands for ['crawl', 'render-json']
- CORE (OSS): unit tests tests tests (round I - excluding `provider_*.py` and `crawl.py`)
- BUG: nsq channels on same node are not grouping, again!
- there is a regression in cycle detection - spider against async-cake-handler to repro
- trim double quotes from service_name
- BUG: crawl of well known port is discovering random connections to frontend-routers, ELBs - fixed by chris r. source ephemeral port filter
- `'CYCLE': f"service '{node['service_name']}' discovered as a parent of itself!",`
- paramiko nested exception outputting
- handle actually null (absent value) nsq consumer in addition to string literal "null"
- ascii renderer grouping by detail is persisting in memory (groupings)
- charlotte: move name parser exceptions (mysql-main) into charlotte
- many repeating group-by-service-name NSQ topic/channels in ascii renderer
- catch timeout for crawling children
- remove trailing `_` from node_ref
- graphviz blocking is backwards
- regression: defunct check in parser on `num_connections == 0` is failing
- differentiate RDS databases found in AWS - currently all show as `rdsnetworkinterface`
- BUG: add type to json serialization - currently brittle: key-ing off of random fields for deserialization
- infinite recursion bug introduced by the crawl hints: caused by `cached_nodes` in crawl.py being held by reference; fixed with a deep-ish copy
- trying to crawl json that was outputted with --depth arg results in hanging on `wait_for_crawl` to complete on nodes
- CRAWL: kubernetes - take a hint
- CRAWL: kubernetes - name lookup, crawl
- support EKS cluster in a different AWS account than provider_aws
- static code analysis (prospector) and forthcoming changes
- refactor providers to objects, remove SSH logic from crawl.py
- caching children in crawl.py instead of providers!!
- fix TIMEOUT logic
- put provider_args back in crawl strategies! use **kwargs to pass args in code
- rewrite provider registration
- move provider constant refs from constants.py into providers
- rename `errors.NULL_IP` -> `NULL_ADDRESS`
- refactor signature of `crawl_downstream` to include address
- replace pass-through node_ref in crawl w/ `zip()`
- validate that crawl strategies are only used for specified providers
- refactor lookup_name to remove life360 business logic from providers!
- remove ProviderInterface::configure(), have ssh configure itself on first query
- seed provider is configurable command line arg w/
- FEATURE: make instance_provider args for aws hints part of a refactored "profile"
- FEATURE: Distinguish kubernetes service shape in graphviz
- add --stop-on-nonblocking CLI arg
- not respecting CrawlStrategy.providers
- need to be able to configure different AWS profile for k8s/eks than for aws! (for dev)
- BUG: intermittent timeout exceptions which do not result in program exit
- REFACTOR: (providers): providers as plugin architecture
- REFACTOR (spider): --concurrency -> --ssh-concurrency OR provider args
- REFACTOR: (all): refactor package architecture
- TIMEOUT: (crawl) robust provider timeout and exception handling
- OBSCURIFIER (render_*): obscurifier for output
- BUG: fix namespace package not being included in dist
- [~] promviz render output
- render nsq
- haproxy http enabled in prod
- render haproxy
- render proxysql
- BUG: geonames orphaned due to query returning no data
- render haproxy tcp mode
- render elasticache
- render kinesis
- render custom queries
- merge hints
- add missing hints
- render_promviz tests
- refactor renderers to plugins
- fix plugin imports
- refactor/DRY providers/renderers to plugin_core.py
- how to organize plugin tests?
- move constants.ARGS to cli_args.ARGS
- update examples plugins/crawl strategies/docs
- [~] PLUGINS: BUG namespace plugins aren't pip install --editable-able
- ci/cd run tests
- ci/cd publish PyPI package
- annotate services w/ links to wiki/github
- RENDER_PLUGINS: make renderer's an abstract class w/ plugins
- REFACTOR: move seed logic out of ./spider.py
- REFACTOR: revisit the Node{Protocol, CrawlStrategy, protocol_mux} object relationship strategy
- FEATURE: track whether a node was skipped for crawling and display as such in graphviz
- REFACTOR: move errors/warnings to a global config
- REFACTOR: do not block crawl() on lookup_name() in main crawl loop. will speed up many times
- REFACTOR: move mutex from provider_ssh to crawl.py
- BUG: intermittent timeouts crawling the whole tree - add retry to lookup_name/crawl_downstream?
- BUG: remove `blocking` from CrawlStrategy - it should only be in Protocol
- BUG: where is `elasticache-time-points`? crawl-netstat only takes 1 ip per port, so for async-soa which has 2 downstreams on 6379, it can't find it
- BUG: where is `cx-dvb`??
- REFACTOR: consolidate Node::crawl_complete and crawl.py::_crawlable()
- BUG: required args showing as optional in --help
- DOCS: remove non obfuscated example video from README
- LOGGER: rewrite logger access for community standards
- PLUGPLAY: out of the box functionality by moving TCP to a "builtin" CrawlStrategy and using `hostname` or default service name
- REFACTOR: (providers): rewrite take_a_hint to not return a list, just return a single NodeTransport
- DOCS: rewrite docs in sphinx style and prepare for export to readthedocs.org
- FEATURE: a new render format that has a predictable sort order, and on top of that the ability to diff
- test coverage for renderers.py
- FEATURE: merge hints in ascii output
- FEATURE: multiple seeds display with equal ranking
- FEATURE: nsq topics as nodes rather than edges
- FEATURE: visualize cycles
- FEATURE: different visualization for cache vs database
- FEATURE: create a legend
- DISPLAY: output in vizceral format
- DISPLAY: 'diff' run on multiple seed nodes and diff!
- BUG: HAProxy: functionality to detect bad HAProxy Config as a crawl error was lost in async refactor
  `if stdout.startswith('ERROR:'): return 'CRAWL ' + stdout.replace("\n","\t"), {}`
- BUG: NSQ: misconfigured clients have null server (this is why we don't see rattail -> relapse), investigate & resolve
- FEATURE: Netstat: use matchAddress for HAProxy crawl strategies to avoid timeout to RDS hostnames
- FEATURE: crawl downstream - ability to specify more providers args per provider (so that k8s can selectively crawl containers)
- FEATURE: detect multiple downstream on same port with NetstatCrawlStrategy - it will only pick up the first
- BUG: cli arg --disable-providers is broken
- FEATURE: revisit whether `occupy_one_sempahore_space` is working (to dynamically configure --concurrency)
- FEATURE: still getting ssh connection errors sometimes without --concurrency=10
- FEATURE: configurable "~/.ssh/config" SSH profile
- REFACTOR (provider_ssh): we shouldn't use known_hosts=None for security reasons
- TEST: write tests for provider_ssh
- FEATURE: lookup_name is slow, use async
- CRAWL: dynamodb
- CRAWL: SQS
- TEST: write tests for provider_aws
- TEST: write tests for provider_k8s
- FEATURE (charlotte): yaml validation by schema
- backwards compatibility for haproxy w/out stats socket
- detect live traffic netstat/tcpdump/ebpf? (this was solved by using haproxy stats)
- remove crawl_strategy from Node()