DB Engines Types
Ext.Links
@[https://db-engines.com/]
@[https://db-engines.com/en/ranking]
@[https://dzone.com/articles/rant-there-is-no-nosql-data-storage-engine]

TODO
@[https://www.infoq.com/news/2018/09/pinterest-goku-timeseries-db]
  Pinterest Switches from OpenTSDB to Their Own Time Series Database
- https://facebook.github.io/prophet/
  Prophet: TSDB forecasting library (wrapper around Stan), particularly
  approachable place to use Bayesian Inference for forecasting use cases
  general purpose.
  - Prophet is a procedure for forecasting time series data based on an
    additive model where non-linear trends are fit with yearly, weekly, and daily
    seasonality, plus holiday effects. It works best with time series that have
    strong seasonal effects and several seasons of historical data. Prophet is
    robust to missing data and shifts in the trend, and typically handles outliers well.

@[https:// technologyconversations.com/2015/09/08/service-discovery-zookeeper-vs-etcd-vs-consul/]
@[https://www.infoq.com/news/2019/05/hashicorp-consul-1.5.0]
dbEngine: Data Topology compared
  Key-value:    wide-Column:         Document
  o -- O        o -- O O O O ... O   O→O→O...
  o -- O        o -- O O O O ... O    ↘    
  o -- O        o -- O O O O ... O     O→O
  ...           ...                     ↘   
                                         O→O
                                          ↘ 
                                           O  
Graph DBMS
- represent data in graph structures of nodes (entities) and edges(relations).
- Relations are are first class citizens (vs foreign keys in SQL ddbbs, 
  "Relational DDBBs lack relationships, some queries are very easy, some
   others require complex joins" ).

- BºModeling real world is easier than in SQL/RDBMs.  º
  BºAlso, they are resillient to model/schema changes.º
- Sort of "hyper-relational" databases.
- Two  properties should  be considered in graph DDBBs:
  └ºunderlying storageº:
    - native optimized graph storage 
    - Non native (serialization to relational/Document/key-value/...  ddbbs.
  └ºprocessing engineº:
    -ºindex-free adjacencyº: ("true graph ddbb"), connected  nodes  physically
      "point"  to  each  other  in  the  database.
    BºMuch faster when doing graph-like queries.º Query performance 
      remains (aproximately) constant in "many-joins" scenarios, while RDBMs
      degrades exponentially with JOINs.
      Time is proportional to the size of the path/s traversed by the query,
      (vs subset of the overall table/s).
    Rº(usually) they DON'T provide indexes on all nodes:º
    Rº- Direct access to nodes based on attribute values is not possibleº
    Rº  or slower.º
    -ºnon-index-free adjacencyº: External indexes are used to link nodes.
    BºLower memory use.º 

          Native |·Infi.Graph        ·OrientDB ·Neo4j
                 |·Titan             ·Affinity ·*dex
           ^^^^^ |
           Graph |
      Processing |                   ·Franz Inc
           vvvvv |                   ·Allegro Graph
                 |·FlockDB           ·HyperGraphDB
                 └---------------------------------
              non         ← Graph  →      Native
              Native        Storage
- Graph Comute Engines execute graph algorithms against large datasets. 
  - Example algorithms include:
    - Idetify clusters in data
    - Answer "how many average relationships do node have"
    - Find "paths" to get from one node to another.
 
  - It can be independent of the underlying storage or not.
  - They can be classified by:
    - in-memory/single machine (Cassovary,...)
    - Distributed: Pegasys, Giraph, ... Most of the based on the white-paper
    BºPregel: A system fro large-scale graph processingº (2010 ACM)
    @[https://dl.acm.org/doi/10.1145/1807167.1807184]
    "Many practical computing problems concern large graphs. ... billions of 
     vertices, trillions of edges ... In this paper we present a
     computational model suitable for this task. "


- GraphQL ISO standard Work in place integrating the 
  ideas from  openCypher (Neo4j contribution), PGQL, GSQL,
  and G-CORE languages.
  - @[https://www.gqlstandards.org/]
- OpenCypher support:
  - Neo4jDB        - Cypher for Spark/Gremin
  - AgensGraph     - Mengraph
  - RedisGraph     - inGraph
  - SAPHANA Graph  - Cypher.PL

ºExamplesº
- Neo4j
- AGE: @[https://www.postgresql.org/about/news/2050/]
  - multi-model graph database extension for PostgreSQL 
  - Support for Subset of Cypher Expressions through
    @[https://www.opencypher.org/]
- AWS Neptune
- Azure Cosmos DB
- Datastax Enterprise
- OrientDB
- ArangoDB


- there areºtwo primary implementationsºboth consisting of
  nodes(or "vertices") and directed edges (or "links"). 
  Data models looking slightly different on each implementation.
  Both allow properties (attribute/value pairs) to be
  associated with nodes.
  -ºgraphs—property graphs (PG)º
    They also allow attributes for edges.
    - no open standards exist for schema definition,
      query languages, or data interchange formats. 
  -ºW3C Resource Description Framework (RDF)º
    Node properties treated simply as more edges.
    (techniques exists to express edge properties in RDF).
    - graphs can be represented as edges or "triples":
      edge triple = ( starting point, label     , end point)  or
      edge triple = (subject        , "predicate, object   )
      (Also called "triple stores")
    - RDF forms part of (a suite of) standardized W3C specs.
      built on other web standards, collectively known as
    BºSemantic Web, or Linked Dataº. They include:
      - schema languages (RDFS and OWL):
      - declarative query language (SPARQL)
      - serialization formats
      - Supporting specifications:
        - mapping RDBMs to RDF graphs
      -Bºstandardized framework for inferenceº
       Bº(ex, drawing conclusions from data)º

  PG resemble conventional data structures in an isolated
  application or use case, whereas RDF are originally designed
  to support interoperability and interchange across 
  independently developed applications.



  - (@[https://www.allthingsdistributed.com/2019/12/power-of-relationships.html])
    "...Because developers ultimately just want to do graphs, you 
     can choose to do fast Apache TinkerPop Gremlin traversals for
     property graph or tuned SPARQL queries over RDF graphs..."

   - TinkerPop:
   @[http://tinkerpop.apache.org/]
     open source, vendor-agnostic, graph computing framework.
     When a data system is TinkerPop-enabled, users are able
     to model their domain as a graph and analyze that graph
     using the Gremlin graph traversal language. 
     ...Sometimes an application is best served by an in-memory, 
     transactional graph database. Sometimes a multi-machine
     distributed graph database will do the job or perhaps the
     application requires both a distributed graph database for
     real-time queries and, in parallel, a Big(Graph)Data processor
     for batch analytics. 

Distributed Graph DB
Query
@[https://www.infoq.com/presentations/graph-query-distributed-execution/]
OWL
https://en.wikipedia.org/wiki/Web_Ontology_Language

https://www.w3.org/wiki/Ontology_editors

The Web Ontology Language (OWL) is a family of knowledge 
representation languages for authoring ontologies. Ontologies are a 
formal way to describe taxonomies and classification networks, 
essentially defining the structure of knowledge for various domains: 
the nouns representing classes of objects and the verbs representing 
relations between the objects. Ontologies resemble class hierarchies 
in object-oriented programming but there are several critical 
differences. Class hierarchies are meant to represent structures used 
in source code that evolve fairly slowly (typically monthly 
revisions) whereas ontologies are meant to represent information on 
the Internet and are expected to be evolving almost constantly. 
Similarly, ontologies are typically far more flexible as they are 
meant to represent information on the Internet coming from all sorts 
of heterogeneous data sources. Class hierarchies on the other hand 
are meant to be fairly static and rely on far less diverse and more 
structured sources of data such as corporate databases.[1]


http://owlgred.lumii.lv/


https://www.cognitum.eu/Semantics/FluentEditor/  
RDBMS
ºRDBMSº
ºFeaturesº
- support E-R data model.
- table schema defined by the
  table name and fixed number
  of attributes and data types.
- A record (=entity) corresponds
  to a row in the table.
- basic operations are defined on
  the tables/relations:
  - CRUD: Create/Read/Update/Delete
  - set ops: union|intersect|difference
  - subset selection defined by filters
  - Projection of subset of table columns
  - JOIN: combination of:
    Cartesian_product+selection+projection
  - TX ACID control
  - user management
- Ops defined in  standard SQL
ºHistoryº
- beginning of 1980s
- Most widely used DBMS

ºMain  Examplesº
- PostgreSQL
- Oracle
- MySQL/MariaDB
  - @[http://myrocks.io/] ← Built on top of key-value RocksDB.
                            optimized for SSD/Memory.
- SQLite / DQLite
- TiDB
  - MySQL compatible
  - RAFT-distributed
  - Rust/go written
  - Features:
    - "infinite" horizontal scalability
    - strong consistency
    - HA
- SQL Server
- DB2
- Hive (https://db-engines.com/en/system/Hive)
  - Home: https://hive.apache.org/
  - RºWARNº: No Foreign keys, NO ACID
  -ºEventual Consistencyº
  - data warehouse software facilitates reading, 
    writing, and managing large datasets residing in 
    distributed storage using SQL.  (Hadoop,...)
  - 2.0+ support Spark as execution engine.
  - Implemented in Java
  - supports analysis of large datasets
    stored in Hadoop's HDFS and
    compatible file systems such as
    Amazon S2 filesystem.
  - SQL-like DML and DDL statements
    Traditional SQL queries implemented in
    MapReduce Java API to  execute SQL apps:
    - necessary SQL abstraction provided to
      integrate SQL-like queries (HiveQL) into
      the underlying Java without the need
      for low-level queries.
  - Hive aids portability of SQL-based apps
    to Hadoop
  - JDBC, ODBC, Thrift

- SQL "summary":
  Beginner:                    Intermediate:AA                                 Advanced:
   1 What and Why of SQL?       5 LIKE, AND/OR and DISTINCT                     9  JOINT ... UNIONS ...
   2 Types of SQL               6 NULLs and Dates                              10  OVER ... PARTITION BY ...
   3 Table, Rows and Colums     7 COUNT, SUM, AVG ... GROUP BY ... HAVING...   11  STRUCT ... UNNEST ..
   4 SELECT ... FROM ...        8 Alias, Sub-queries and WITH ...AS ..         12  Reguar Expresions
       WHERE ... ORDER BY ...





dqlite.io
https://dqlite.io/
Dqlite (“distributed SQLite”) extends SQLite across a cluster of machines,
with
automatic failover and high-availability to keep your application running. It
uses C-Raft, an optimised Raft implementation in C, to gain high-performance
transactional consensus and fault tolerance while preserving SQlite’s
outstanding efficiency and tiny footprint.
SchemaSpy
@[http://schemaspy.org/]
- SchemaSpy is generates database to HTML documentation, including Entity Relationship diagrams.
KEY-VALUE
ºKEY-VALUE STORESº
-ºsimplestºform of DBMS.
- store pairs of keys and values
-ºhigh performanceº
- not adequate for complex apps.
  Low data consistency in distributed clusters,
  when compared to Registry Based.
- value is not ussually "big" (when compared
  to wide-column DDBBs like Cassandra,..). 
  Not designed for high-data-layer apps, but for low-level software tasks.
- extended forms allows to sort the keys, enabling range queries as well as an 
  ordered processing of keys.
- Can evolve to document stores and wide column stores.

ºExamplesº
-ºRedisº: Cache/RAM-Memory oriented. Not strong cluster consistency 
          like etcd / Consul / Zookeeper.
  - Comparative: redis (cache/simple key-value ddbb) 
                 vs
                 etcd  (key-value Registry DDBB):
    - etcd is designed with high availability in mind,
      being distributed by default - with RAFT consensus algorithm -
      and providing consistency through node failures.
      redis doesn't. 
    - etcd is persisted to disk by default. Redis is not.
     (logs and snapshots are possibles) 
    - Redis/memecached is a blazing fast in-memory key-value store 
    - etcd read performance is good for most purposes, but  1-2 orders
      of magnitude lower than redis. Write gap is even bigger.

  - Comparative: redis     (cache/simple key-value ddbb) 
                 vs
                 memcached (cache/simple key-value ddbb) 
    - redis: able to store typed data (lists, hashes, ...)
      and supporting lot of different programming patterns out of
      the box, while memcached doesn't.

-ºAmazon DynamoDBº
-ºMemcachedº

-ºGuava Cacheº
@[https://github.com/google/guava/wiki/CachesExplained]
  - ºnon-distributedº easy-to-use Java library for data caching
  - A Cache is similar to ConcurrentMap, but not quite the same. The most
    fundamental difference is that a ConcurrentMap persists all elements that
    are added to it until they are explicitly removed. A Cache on the other
    hand is generally configured to evict entries automatically, in order to
    constrain its memory footprint. In some cases a LoadingCache can be useful
    even if it doesn't evict entries, due to its automatic cache loading.

-ºHazelcastº
  - Designed for caching.
  - Hazelcast clients, by default, will
    connect to all cache cluster nodes
    an know about the cluster partition
    table to route request to the correct
    node.
  - Java JCache compliant
  - Advanced cache eviction algorithms
    based on heuristics with sanitized O(1)
    runtime behavior.

- LabelDB ("Ethereum State"): Non distributed, with
  focus in local-storage persistence.
- TiKV
  - dev lang:ºRustº
  - Incubation Kubernetes project
  - used also for TiDB
-ºRocksDB:º(by Facebook)
  - built on earlier work on LevelDB
  - core building block for fast key-value server.
  - Focus: storing data on flash drives.
  - It has a Log-Structured-Merge-Database (LSM) design with
    flexible tradeoffs between Write-Amplification-Factor (WAF),
    Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF).
  - It has multi-threaded compactions, making itºespecially suitableº
   ºfor storing multiple terabytes of data in a singleº
   ºlocal (vs distributed) database.º.
   ºIt can be used for big-data processing on single-node infrastructureº
  - Used by Geth/Besu/... and other Ethereum clients ...
Search
- "NoSQL" DBMS for search of content
- Optimized for:
  - complex search expressions
  - Full text search
  - reducing words to stem
  - Ranking and grouping of results
  - Geospatial search
  - Distributed search for high
    scalability
ºExamplesº
- Elasticsearch
- Splunk
- Solr
- MarkLogic
- Sphinx
- Eclipse Hawk:
  projects.eclipse.org/projects/modeling.hawk
  heterogeneous model indexing framework:
  - indexes collections of models
    transparently and incrementally into
    NoSQL DDBB, which can be queried
    efficiently.
  - can mirror EMF, UML or Modelio
    models (among others) into a Neo4j or
    OrientDB graph, that can be queried
    with native languages, or Hawk ones.
    Hawk will watch models and update
    the graph incrementally on change.
Time Series
- Managing time series data
- very high TX load:
  - designed to efficiently collect,
    store and query various time
    series.
- Ex use-case:
  SELECT SENSOR1_CPU_FREQUENCY / SENSOR2_HEAT'
  joins two time series based on the
  overlapping areas of time providing
  new time-serie

ºExamplesº
- Timescale.com
  - PostgreSQL optimized for Time Series.
    by modifying the insert path,
    execution engine, and query planner
    to "intelligently" process queries
    across chunks.
- QuestDB:
  - https://questdb.io/
  - RelationalModel.
  - PostgresSQL wire protocol.(TODO)
  - "Query 1.6B rows in millisecs"
  - Relational Joins + Time-Series Joins
  - Unlimited Transaction Size
  - Cross Platform.
  - Java Embedded.
  - Telegraf TCP/UDP

- InfluxDB
- Kdb+
- Graphite. Simple system that will just:
  - Store numeric time series data
  - Render graphs of this data
  It willºNOTº:
  - collect data (Carbon needed)
  - render graphs (external apps exists)
- RRDtool
- Prometheus
  - https://prometheus.io/
  - See also: Cortex: horizontally scalable, highly available,
                      multi-tenant, long term storage for Prometheus.
  - CNCF project,ºkubernetes friendlyº
  - GoLang based:
    - binaries statically linked
    - easy to deploy.
  - Many client libraries.
  - monitoring metrics analyzer
    and alerting
  - Highly dimensional data model.
    Time series are identified by a
    metric name and a set of
    key-value pairs.
  - stores all data as time series:
    streams of timestamped values
    belonging to same metric and
    same set of labeled dimensions.
  - Multi-mode data visualization.
  - Grafana integration
  - Built-in expression browser.
  - PromQL Query language allowing
    to select+aggregate time-series
    data in real time.
    - result can either be shown as
      graph, tabular data or consumed
      by external HTTP API.
  - Precise alerts based on PromQL.
  - scaling:
    - sharding
    - federation
  - "Singleton" servers relying only
    on local storage.
    (optionally remote storage).

- Uber M3
  www.infoq.com/news/2018/08/uber-metrics-m3
  - Large Scale Metrics Platform
""" built to replace Graphite+Carbon cluster,
    and Nagios for alerting and Grafana for
    dashboarding due to issues like
    poor resiliency/clustering, operational
    cost to expand the Carbon cluster,
    and a lack of replication""".
  ºFeaturesº
  - cluster management, aggregation,
    collection, storage management,
    a distributed TSDB
  - M3QL query language (with features
    not available in PromQL).
  - tagging of metrics.
  - local/remote integration similar to
    the Prometheus Thanos extension providing
    cross-cluster federation, unlimited
    storage and global querying across
    clusters, works.
  - query engine: single global view
    without cross region replication.

- Redis TimeSeries:
  - By default Redis is just a key-value store.
  @[https://redislabs.com/redis-enterprise/redis-time-series/]
    RedisTimeSeries simplifies the use of Redis for time-series use
    cases like IoT, stock prices, and telemetry.
  - See also:
  @[https://www.infoq.com/articles/redis-time-series-grafana-real-time-analytics/]
    How to Use Redis TimeSeries with Grafana for Real-time Analytics
TS Popularity
@[https://www.techrepublic.com/article/why-time-series-databases-are-exploding-in-popularity/]
M3DB
M3DB: (Uber) OOSS distributed timeseries database 
- ability to shard its metrics into partitions,
  replicate them by a factor of three,
  and then evenly disperse the replicas across separate 
  failure domains.
RDF stores
@[https://db-engines.com/en/article/RDF+Stores]
@[https://en.wikipedia.org/wiki/Resource_Description_Framework]
- Resource Description Framework stores
  ºdescribes information in triplets:º
 º(subject,predicate,object)º
- Originally used for describing
  IT-resources-metadata.
- Today often used in semantic web.
- RDS is a subclass of graph DBMS:
  (subject,predicate,object)
   ^          ^      ^
   node      edge   node
  but it offer specific methods
  beyond general graph DBMS ones. Ex:
  SPARQL, SQL-like query lang. for
  RDF data, supported by most
  RDF stores.

ºExamplesº
- MarkLogic
- Jena
- Virtuoso
- Amazon Neptune
- GraphDB
- Apache Rya:
JSON-LD
- format for mapping JSON data into the RDF semantic graph model, as defined by [JSON-LD]. 

@[https://en.wikipedia.org/wiki/JSON-LD
Apache Rya
@[https://searchdatamanagement.techtarget.com/news/252472464/Apache-Rya-matures-open-source-triple-store-database]
The project started at the U.S. government's Laboratory for Telecommunication
Sciences with an initial research paper published in 2012.

Among Rya's users is the U.S. Navy, which is using the open source technology
in a number of efforts, including one for autonomous drones. The project got
started because there was a need for a scalable RDF triple store and no
existing system that met all requirements, said Adina Crainiceanu, vice
president of Apache Rya and associate professor of computer science at the U.S.
Naval Academy.

Service Registry
 ºand Discoveryº

 - Specialized key/value DBMS:
   -ºtwo processes must existsº:
     -ºService registration processº
       storing final-app-service
       (host,port,...)
     -ºService discovery processº
       - let final-app-services query
         the data
   - other aspects to consider:
     - auto-delete of non-available services
     - Support for replicated services
     - Remote API is provided

 ºExamplesº
 -ºZooKeeperº
   - originated from Hadoop ecosystem.
     now is core part of most cluster based Apache
     projects (hadoop, kafka, solr,...)
   - data-format similar to file system.
   - cluster mode.
   - Disadvantages:
     - complex:
     - Java plus big number of dependencies
   - Still used by kafka for config but
     plans exists to replace.
   @[https://issues.apache.org/jira/browse/KAFKA-6598]
   @[https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum] - 2019-11-06 -
 -ºetcdº
   - distributed versioned key/value store
   - HTTPv2/gRPC remote API.
   - based on previous experiences with ZooKeeper.
   - hierarchical config.
   - Very easy to deploy, setup and use
     (implemented in go with self-executing binary)
   - reliable data persistence
   - very good doc
   - @https://coreos.com/blog/introducing-zetcd]
   - See etcd vs Consul vs ZooKeeper vs NewSQL
     comparative at:
   @[https://etcd.io/docs/v3.4.0/learning/why/]
     and:
   @[https://loneidealist.wordpress.com/2017/07/12/apache-zookeeper-vs-etcd3/]
   Disadvantages:
   - needs to be combined with few
     third-party tools for  serv.discover:
     - etcd-registrator: keep updated  list
                       of docker containers
     - etcd-registrator-confd: keep updated
                               config files
     - ...
   - Core-component of Kubernetes for cluster
     configuration.

 -ºConsulº
   - strongly consistent datastore
   - multidatacenter gossip protocol
     for dynamic clusters
   - hierarchical key/value store
   - adds the notion of app-service
     (and app-service-data).
     - "watches" can be used for:
       - sending notifications of
         data changes
       - (HTTP, TTLs , custom)
         health checks and
         output-dependent
         commands

   - embedded service discovery:
     no need to use third-party one
     (like etcd). Discovery includes:
     - node         health checks
     - app-services health checks
     - ...
   - Consul provides a built in framework
     for service discovery.
     (vs etcd basic key/value + custom code)
     - Clients just register services and
       perform discovery using the DNS
       or HTTP interface.
   - out of the box native support for
     multiple datacenters.
   - template-support for config files.
   - Web UI: display all services and nodes,
     monitor health checks, switch
     from one datacenter to another.
 - doozerd (TODO)
 See also:
 - Comparision chart:
 coreos.com/etcd/docs/latest/learning/why.html
etcd 101
"etcd" name refers to unix "/etc" folder and "d"istributed system.

- etcd is designed as a general substrate for large scale distributed systems. 
  These are systems that will never tolerate split-brain operation and are 
  willing to sacrifice availability to achieve this end. etcd stores metadata in 
  a consistent and fault-tolerant way. An etcd cluster is meant to provide 
  key-value storage with best of class stability, reliability, scalability and 
  performance.

@[https://github.com/etcd-io/etcd]
@[https://etcd.io/docs/v3.4.0/]

@[https://etcd.io/docs/v3.4.0/learning/] 
- designed to:
  reliably (99.999999%) store infrequently updated data 
  reliable watch queries.
- multiversion persistent key-value store of key-value pairs:
  - inexpensive snapshots 
  - watch history events ("time travel queries").
- key-value store is effectively immutable unless 
  store is compacted to to shed oldest versions.

Logical view
- flat binary key space.
- key space: 
  - lexically sorted index on byte string keys
    → soºrange queries are inexpensiveº.
  - Each key/key-space maintains multiple revisions starting with version 1
- revisions are indexed as well
  →ºranging over revisions with watchers is efficientº

- generation: key life-span, from creation to deletion.
  1 key ←→ 1+ generation.

Physical view:
- data stored as key-value pairs in a persistent b+tree.
-ºkey of key-value pair is a 3-tuple (revision, sub,     type)º.
                                                ^        ^
                                                Unique   (opt)
                                                key-ID   special key-type
                                                in rev.  (deleted key,..)




@[https://jepsen.io/analyses/etcd-3.4.3]
- In our tests, etcd 3.4.3 lived up to its claims for key-value operations: we 
  observed nothing but strict-serializable consistency for reads, writes, and 
  even multi-key transactions, during process pauses, crashes, clock skew, 
  network partitions, and membership changes. Strict-serializable behavior was 
  the default for key-value operations; performing reads with the serializable 
  flag allowed stale reads, as documented.

@[https://etcd.io/docs/v3.4.0/learning/design-client/]
- etcd v3+ is fully committed to (HTTPv2)gRPC.
  For example: v3 auth has connection based authentication, 
  rather than v2's slower per-request authentication.

Wide Column Stores
(also called extensible record stores)
-ºshort of two-dimensionalº ºkey-value stores.º
  where key set as well as value can be "huge"
  (thousands of columns per value)

  - Compared to Service Registry DDBBs (Zookeeper,
    etcd3) that are also key-value stored :
    - Ser.Registry ddbb scale much less (many
      times a single node can suffice) but are
      strongly consistent. Wide Column Stores
      can scale to ten of thousands of nodes but
      consistency is eventual / optimistic.
  - column names and record keys ºare not fixedº
                                  └─────┬─────┘
                                  (schema-free)
- not to be confused with column oriented storage
  of RDMS. Last one is an internal concept
  for improving performance storing
  table-data column-by-column vs
  record-by-record)
ºExamplesº
-ºCassandraº
  - SQL-like SELECT, DML and DDL statements (CQL)
  - handle large amounts of data across farm
    of commodity servers, providing high availability
    with no single point of failure. (peer-to-peer)
  - Allows for different replication strategies
    (sync/async for slow-secure/fast-less-secure)
    cluster replication.
  - logical infrastructure support for clusters
    spanning multiple datacenters in different 
    continents.
  -ºIntegrated Managed Data    º
   ºLayer Solutions with Spark,º
   ºKafka and Elasticsearch.   º

-ºScyllaº
  - compatible with Cassandra (same CQL and Thrift protocols,
     and same SSTable file formats)
    with  higher throughputs and lower latencies (10x).
  - C++14 (vs Cassandra Java)
  -ºcustom network managementºthat minimizes resource by
   ºbypassing the Linux kernel. No system call is requiredº
    to complete a network request. Everything happens in userspace.
  -ºSeastar async lib replacing threadsº
  - sharded design by node:
    - each CPU core handles a data-subset
    - authors claim to achieve much better
      performance on modern NUMA SMP archs,
      and to scale very well with the number
      of cores:
      -ºUp to 2 million req/sec per machineº
-ºHBaseº. Apache alternative to Google BigTable
  Internally uses skip-lists:
@[../programming_theory.html?query=e65f9917-5f27-4b78-8a5a-0653640b6b88]
_ºCosmosDBº
REF: Google bigTable-osdi06.pdf
-ºGoogle Spannerº
Cassandra 101

Cassandra whitepaper:
-ºdistributed key-value storeº
- developed at Facebook.
- data and processing spread out across many commodity servers
- highly available service without single point of failure
  allowing replication even across multiple data centers.
- Possibility to choose synchronous/asynchronous replication
  for each update.
-ºfully distributed DB, with no master DBº
- Reads are linearly scalable with the number of nodes.
  It scalates (much)better cassandra that comparables systems
  like hbase,voldermort voldtdb redis mysql
  Writes are linearyly scalable?

- based on 2 core technologies:
  - Google's Big Table
  - Amazon's Dynamo

- 2 versions of Cassandra:
  - Community Edition  : distributed under the Apache™ License
  - Enterprise Edition : distributed by Datastax

- Cassandra is a BASE system (vs an ACID system):
  BASE == (B)asically (A)vailable, (S)oft state, (E)ventually consistent)
  ACID == (A)tomicity, (C)onsistency, (I)solation, (D)urability.

Oº- BASE implies that the system is optimistic and accepts that   º
Oº  the database consistency will be in a state of fluxº
Oº- ACID is pessimistic and it forces consistency at the endº
Oº  of every transaction.º


               Replication
               Strategy
                   1
                   ↑
                   ↓
                   1
cluster 1 ←→ 1+ Keyspace 1 ←→ N Column Family   ←→ 0+ Row
             └┬┘                └─────┬─────┘
         Typically
         just one               sharing similar   - collection of
                                structure           sorted columns.

  ┌─ CQL lang. used to access data.
┌─┴───────────────────────────────────────────────────────────────┐
│OºKEYSPACEº: container for app.data, similar to a RDMS schema    │
│             ┌─────────────────────────────────────────────────┐ │
│             │ Column Family 1                                 │ │
│             └─────────────────────────────────────────────────┘ │
│             ┌─────────────────────────────────────────────────┐ │
│             │ Column Family 2                                 │ │
│             └─────────────────────────────────────────────────┘ │
│               ...                                               │
│ ┌─────────┐ ┌─────────────────────────────────────────────────┐ │
│ │Settings │ │ Column Family N      Col    Col2  ...  ColN     │ │
│ ├─────────┤ │                      Key1   Key2  ...  KeyN     │ │
│ │*Replica.│ │ ┌────────┐ ┌──────────────────────────────────┐ │ │
│ │ Strategy│ │ │Settings│ │RowKey1│Val1_1│Val2_1│   │ValN_1│ │ │ │
│ └─────────┘ │ │        │ │RowKey2│Val2_2│Val2_2│   │ValN_2│ │ │ │
│             │ └────────┘ │         ...                      │ │ │
│             │            └──────────────────────────────────┘ │ │
│             └─────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
                            RowKey := Unique row UID
                                      Used also to sharde/balance
                                      load acrossºnodesº
                            ValX_Y := value ||
                                      value collection


     BASIC INFRASTRUCTURE ARCHITECTURE:
     ──────────────────────────────────
ºNodeº: (Cassandra peer instance)  ºRackº:
 - data storage (on File System     - Set of nodes
   or HDFS)
 - The data is balanced
   across nodes based
   on the RowKey f each
   Column Family.
 - Commit log to ensure
   data durability.
   - Sequentially written                    automatically
                                             replicated throughout
             ┌── Node "N" choosen            cluster
             │   based on RowKey                 ↑
             ↓                                ┌──┴───┐
Input → Commit Log → Index+write →(memTable → Write to    ──→ SSTable
        @Node N      toºmemTableº  full)     ºSSTableº        Compaction
                        └──┬───┘              └──┬───┘        └───┬────┘
           - in-memory structure   - (S)orted (S)tring Table  - Periodically consolidates
           - "Sort-of" write-back    in File System or HDFS     storage by discarding
             cache                                              data marked for deletion
                                                                with a tombstone.
                                                              - repair mechanisms exits
                                                                to ensure consistency
                                                                of cluster

ºDatacenterº:                 ºClusterº:
- Logical Set of Racks.       - 1+ Datacenters
- Ex: America, Europe,...     - full set of nodes
- Replication set at            which map to a single
  Data Center bounds            complete token ring

CQL
- similar syntax to SQL
- works with table data.
- Available Shells: cqlsh | DevCenter | JDBC/ODBC drivers

ºCoordinator Nodeº:
- Node where client connects for a read/write request
  acting as a proxy between the client and Cassandra
  internals.

Cassandra Key Components
 Gossip
    A peer-to-peer communication protocol in which nodes periodically exchange
state information about themselves and about other nodes they know about. This
is similar to hear-beat mechanism in HDFS to get the status of each node by the
master.

    A peer-to-peer communication protocol to share location and state
information about the other nodes in a Cassandra cluster. Gossip information is
also stored locally by each node to use immediately when a node restarts.

    The gossip process runs every second and exchanges state messages with up
to three other nodes in the cluster. The nodes exchange information about
themselves and about the other nodes that they have gossiped about, so all
nodes quickly learn about all other nodes in the cluster.

    A gossip message has a version associated with it, so that during a gossip
exchange, older information is overwritten with the most current state for a
particular node.

    To prevent problems in gossip communications, use the same list of seed
nodes for all nodes in a cluster. This is most critical the first time a node
starts up. By default, a node remembers other nodes it has gossiped with
between subsequent restarts. The seed node designation has no purpose other
than bootstrapping the gossip process for new nodes joining the cluster. Seed
nodes are not a single point of failure, nor do they have any other special
purpose in cluster operations beyond the bootstrapping of nodes.



Note: In multiple data-center clusters, the seed list should include at least
one node from each data center (replication group). More than a single seed
node per data center is recommended for fault tolerance. Otherwise, gossip has
to communicate with another data center when bootstrapping a node. Making every
node a seed node is not recommended because of increased maintenance and
reduced gossip performance. Gossip optimization is not critical, but it is
recommended to use a small seed list (approximately three nodes per data
center).
Data distribution and replication

How data is distributed and factors influencing replication.

In Cassandra, Data is organized by table and identified by a primary key, which
determines which node the data is stored on. Replicas are copies of rows.

Factors influencing replication include:
Virtual nodes (Vnodes):

Assigns data ownerommended for production. It defines a node’s data center
and rack and uses gossip for propagating this information to other nodes.

There are many vtypes of snitches like dynamic snitching, simple snitching,
RackInferringSnitch, PropertyFileSnitch, GossipingPropertyFileSnitch,
Ec2Snitch, Ec2MultiRegionSnitch, GoogleCloudSnitch, CloudstackSnitch.

TODO:
https://rene-ace.com/cassandra-101-understanding-what-is-cassandra/
- Seed node
- Snitch purpose
- topologies
- Coordinator node,
- replication factors,
- ...
Cassandra+Spark
@[https://es.slideshare.net/chbatey/1-dundee-cassandra-101]

Similar projects
- Voldermort: developed by Linked-In.
@[https://www.project-voldemort.com/voldemort/]

 cstart
Cassandra Orchestration Tool by Spotify
@[https://github.com/spotify/cstar]

@[www.infoq.com/news/2018/10/spotify-cstar]
""" ... Cstar emerged from the necessity
    of running shell commands on all host 
    in a Cassandra cluster .... 
    Spotify fleet reached 3000 nodes. Example
    scripts were:  
    - Clear all snapshots 
    - Take a new snapshot (to allow a rollback)
    - Disable automated puppet runs
    - Stop the Cassandra process
    - Run puppet from a custom branch of our git repo
      in order to upgrade the package
    - Start the Cassandra process again
    - Update system.schema_columnfamilies to the JSON format
    - Run `nodetool upgradesstables`, which depending on
      the amount of data on the node, could take hours to
      complete
    - Remove the rollback snapshot
"""

$ pip3 install cstart

REF: @[https://github.com/spotify/cstar]
"""Why not simply use Ansible or Fabric?
Ansible does not have the primitives required to run things in a topology aware 
fashion. One could split the C* cluster into groups that can be safely executed 
in parallel and run one group at a time. But unless the job takes almost 
exactly the same amount of time to run on every host, such a solution would run 
with a significantly lower rate of parallelism, not to mention it would be 
kludgy enough to be unpleasant to work with.
Unfortunately, Fabric is not thread safe, so the same type of limitations apply. 
All involved machines are assumed to be some sort of UNIX-like system like
OS X or Linux with python3, and Bourne style shell."""
Dynamo vs Cassandra
https://sujithjay.com/data-systems/dynamo-cassandra/
Dynamo vs Cassandra : Systems Design of NoSQL Databases

State-of-the-art distributed databases represent a distillation of 
years of research in distributed systems. The concepts underlying any 
distributed system can thus be overwhelming to comprehend. This is 
truer when you are dealing with databases without the strong 
consistency guarantee. Databases without strong consistency 
guarantees come in a range of flavours; but they are bunched under a 
category called NoSQL databases.
Google Spanner
(BigTable subtitution)
https://cloud.google.com/spanner/
- Features:
  - Schema
  - SQL
  - Consistency
  - Availability
  - Scalability
  - Replication
Document Stores
- also called document-oriented DBMS
- schema-free:
  - different records may have
    different columns
  - values of individual columns
    can have dif. types
- Columns can be multi-value
- Records can have a nested structure
- Often use internal notations,
  mostly JSON.
- features:
  - secondary indexes in (JSON) objects

ºExamplesº
- MongoDB
  www.infoq.com/articles/Starting-With-MongoDB
  14 Things I Wish I’d Known When Starting with
  MongoDB
- Amazon DynamoDB
- Couchbase
- Cosmos DB
- CouchDB
Data(log)
collect
ºFluentDº                                     |ºLogstashº
 "Improved" logstat                           |- TheºLº in ELK
 - https://www.fluentd.org/                   |-*OS data Collector
 - data collector for unified logging layer   |- TODO:
 - increasingly used Docker, GCP,             |  osquery.readthedocs.io/en/stable/
   and Elasticsearch communities              |- low-level instrumentation framework
 - https://logz.io/blog/fluentd-logstash      |  with system analytics and monitoring
   FluentD vs Logstash compared               |  both performant and intuitive.
ºFeaturesº                                    |- osquery exposes an operating system
 - unify data collection and consumption      |  as a high-performance SQL RDMS:
   for better understanding of data.          |  - SQL tables represent concepts such
                                              |    as running processes, loaded kernel
                                              |    modules, open network connections,
Syslog                      Elasticsearch     |    browser plugins, hardware events
Apache/Nginx logs    → → →  MongoDB           |ºFeatures:º
Mobile/Web app logs  → → →  Hadoop            |  - File Integrity Monitoring (FIM):
Sensors/IoT                 AWS, GCP, ...     |  - DNS
                                              |  - *1

*1: https://medium.com/palantir/osquery-across-the-enterprise-3c3c9d13ec55



|ºPrometheus Node ExporteRº                     |ºOthersº
|- https://github.com/prometheus/node_exporter  |- collectd
|- TODO:                                        |- Dynatrace OneAgent
                                                |- Datadog agent
                                                |- New Relic agent
                                                |- Ganglia gmond
                                                |- ...
Loki
@[https://grafana.com/loki]
- logging backend, optimized for Prometheus and Kubernetes
- optimized to search, visualize and explore your logs natively in Grafana.
ELK
@[https://opensource.com/article/18/9/open-source-log-aggregation-tools]
  - Elasticsearch, Logstash, and Kibana
  - developed and maintained by Elastic.
  - Elasticsearch:
    - essentially a NoSQL, Lucene search engine implementation.
      Extracted from: @[https://docs.sonarqube.org/latest/requirements/requirements/]
     """ ...the "data" folder houses the Elasticsearch indices on which a huge amount
         of I/O will be done when the server is up and running. Great read ⅋ write
         hard drive performance will therefore have a great impact on the overall
         server performance.
     """


  - Logstash: log pipeline (ingest/transform/load it into a store
              like Elasticsearch)
    (It is common to replace Logstash with Fluentd)
  - Kibana: visualization layer on top of Elasticsearch.
  - In productiona few other pieces might be included like:
    - Kafka, Redis, NGINX, ....

-ºElasticSearch cluster solutions in practice means:º
  - Deploy dozens of machines/containers.
  - Server tuning
  - mapping writing: i.e., - deciding how the data is:
    - tokenized
    - analyzed
    - indexed
    depending of user's requirements, internal limitations, and 
    available hardware resources.

Graylog
- log management tool based on Java, ElasticSearch and MongoDB.
   Graylog can be used to collect, index and analyze 
   any server log from a centralized location or distributed location. We can 
   easily monitor any unusual activity for debugging applications and logs using 
   Graylog. Graylog provides a powerful query language, alerting abilities, a 
   processing pipeline for data transformation and much more. We also can extend 
   the functionality of Graylog through a REST API and Add-ons.
  - gaining popularity in the Go community with
    the introduction of the Graylog Collector Sidecar
    written in Go.
  - ... it still lags far behind the ELK stack.
  - Composed  under the hood of:
    - Elasticsearch
    - MongoDB
    - Graylog Server.
  - comes with alerting built into the open source version
    and streaming, message rewriting, geolocation, ....
  - Ex:
  @[https://www.howtoforge.com/how-to-monitor-log-files-with-graylog-v31-on-debian-10/]


Fluentd
  - ºIt is not a log aggregation system.º
  - developed at Treasure Data,
  - Adopted by the CNCF as Incubating project.
  - Recommended by AWS and Google Cloud.
  - Common replacement for Logstash
    - It acts as a local aggregator to collect
      all node logs and send them off to central
      storage systems.
  - 500+ plugins for  quick and easy integrations with
    different data input/outputs.
  - common choice in Kubernetes environments due to:
    - low memory requirements (tens of megabytes)
      (each pod has a Fluentd sidecar)
    - high throughput.

Cerebro: admin GUI
for ElasticSearch
@[https://www.redhat.com/sysadmin/cerebro-webui-elk-cluster]
Elasticsearch comes as a set of blocks, and you—as a designer - are supposed
to glue them together. Yet, the way the software comes out of the box does not
cover everything. So, to me, it was not easy to see the cluster's heartbeat
all in one place. I needed something to give me an overview as well as allow me
to take action on basic things.
I wanted to introduce you to a helpful piece of software I found: Cerebro.

According to the Cerebro GitHub page:
Cerebro is an open source (MIT License) elasticsearch web admin tool built
using Scala, Play Framework, AngularJS, and Bootstrap.
Logreduce
IA filter
Logreduce IA filter
  - logreduce@pypi
  - Quiet log noise with Python and machine learning
Streams
Distributed (Event) Stream Processors
REF:
- Stream_processing@Wikipedia
- Patterns for streaming realtime anaylitics
- ???
- Streaming Reactive Systems & Data Pites w. Squbs
- Streaming SQL Foundations: Why I Love Streams+Tables 
- Next Steps in Stateful Streaming with Apache Flink
- Kafka Streams - from the Ground Up to the Cloud 
- Data Decisions with Real-Time Stream Processing
- Foundations of streamng SQL
- The Power of Distributed Snapshots in Apache Flink
- Panel: SQL over Streams, Ask the Experts
- Survival of the Fittest - Streaming Architectures
- Streaming for Personalization Datasets at Netflix
- < href="https://www.safaribooksonline.com/library/view/an-introduction-to/9781491934951/">An Introduction to Time Series with Team Apache 
""" Apache Cassandra evangelist Patrick McFadin shows how to solve time-series data
    problems with technologies from Team Apache: Kafka, Spark and Cassandra.
      - Kafka: handle real-time data feeds with this "message broker"
      - Spark: parallel processing framework that can quickly and efficiently
               analyze massive amounts of data
      - Spark Streaming: perform effective stream analysis by ingesting data
               in micro-batches
      - Cassandra: distributed database where scaling and uptime are critical
      - Cassandra Query Language (CQL): navigate create/update your data and data-models
      - Spark+Cassandra: perform expressive analytics over large volumes of data

|ºKafkaº@[./kafka_map.html]                             | ºSparkº (TODO)
|-@[https://kafka.apache.org]                           | @[http://spark.apache.org/]
|- scalable,persistent and fault-tolerant               | - Zeppelin "Spark Notebook"(video):
|  real-time log/event processing cluster.              | @[https://www.youtube.com/watch?v=CfhYFqNyjGc]
|- It's NOT a real stream processor  but a              |
|  broker/message-bus to store stream data              |
|  for comsuption.                                      |
|- "Kafka stream" can be use to add                     |
|  data-stream-processor capabilities                   |
|- Main use cases:                                      |
|  - real-time reliable streaming data                  | ºFlinkº TODO
|    pipelines between applications                     | - Includes a powerful windowing system
|  - real-time streaming applications                   |   supports many types of windows:
|    that transform or react to the                     |   - Stream windows and win.aggregations are crucial
|    streams of data                                    |     building block for analyzing data streams.
|- Each node-broker in the cluster has an               |
|  identity which can be used to find other             |
|  brokers in the cluster. The brokers also             |
|  need some type of a database to store                |
|  partition logs.                                      |
|-@[http://kafka.apache.org/uses]                       |
|  (Popular) Use cases                                  |
|  BºMessagingº: good replacement for traditional       |
|    message brokers decoupling processing from         |
|    data producers, buffering unprocessed messages,    |
|    ...) with better throughput, built-in              |
|    partitioning, replication, and fault-tolerance     |
|  BºWebsite Activity Trackingº(original use case)      |
|    Page views, searches, ... are published            |
|    to central topics with one topic per activity      |
|    type.                                              |
|  BºMetricsº operational monitoring data.              |
|  BºLog Aggregationº replacement.                      |
|    - Group logs from different servers in a central   |
|      place.                                           |
|    - Kafka abstracts away the details of files and    |
|      gives a cleaner abstraction of logs/events as    |
|      a stream of messages allowing for lower-latency  |
|      processing and easier support for multiple data  |
|      sources and distributed data consumption.        |
|      When compared to log-centric systems like        |
|      Scribe or Flume, Kafka offers equally good       |
|      performance, stronger durability guarantees due  |
|      to replication, and much lower end-to-end        |
|      latency.                                         |
|  BºStream Processingº: processing pipelines           |
|    consisting of multiple stages, where raw input     |
|    data is consumed from Kafka topics and then        |
|    aggregated/enriched/transformed into new           |
|    topics for further consumption.                    |
|    - Kafka 0.10.+ includes Kafka-Streams to           |
|      easify this pipeline process.                    |
|    - alternative open source stream processing        |
|      tools include Apache Storm and Samza.            |
|  BºEvent Sourcing architectureº: state changes        |
|    are logged as a time-ordered sequence of           |
|    records.                                           |
|  BºExternal Commit Log for distributed systemsº:      |
|    - The log helps replicate data between             |
|      nodes and acts as a re-syncing mechanism         |
|      for failed nodes to restore their data.          |

See also:
-Flink vs Spark Storm:
@[https://www.confluent.io/blog/apache-flink-apache-kafka-streams-comparison-guideline-users/]
Check also: Streaming Ledger. Stream ACID TXs on top of Flink

@[https://docs.confluent.io/current/schema-registry/index.html]
Confluent Schema Registry provides a serving layer for your metadata. It
provides a RESTful interface for storing and retrieving Apache Avro® schemas.
It stores a versioned history of all schemas based on a specified subject name
strategy, provides multiple compatibility settings and allows evolution of
schemas according to the configured compatibility settings and expanded Avro
support. It provides serializers that plug into Apache Kafka® clients that
handle schema storage and retrieval for Kafka messages that are sent in the
Avro format.


- LodDevice(Facebook)
@[https://www.infoq.com/news/2018/09/logdevice-distributed-logstorage]
  - LogDevice has been compared with other log storage systems
    like Apache BookKeeper and Apache Kafka.
  - The primary difference with Kafka
  (@[https://news.ycombinator.com/item?id=17975328]
    seems to be the decoupling of computation and storage
  - Underlying storage based on RocksDB, a key value store
    also open sourced by Facebook
Event Based
Architecture
@[https://www.infoq.com/news/2017/11/jonas-reactive-summit-keynote]

Jonas Boner ... talked about event driven services (EDA) and
event stream processing (ESP)... on distributed systems.

... background on EDA evolution over time:
- Tuxedo, Terracotta and Staged Event Driven Architecture (SEDA).

ºevents represent factsº
- Events drive autonomy in the system and help to reduce risk.
- increase loose coupling, scalability, resilience, and traceability.
- ... basically inverts the control flow in the system
- ... focus on the behavior of systems as opposed to the
  structure of systems.

- TIP for developers:
 ºDo not focus on just the "things" in the systemº
 º(Domain Objects), but rather focus on what happens (Events)º
- Promise Theory:
  - proposed by Mark Burgess
  - use events to define the Bounded Context through
    the lense of promises.

quoting Greg Young:
"""Modeling events forces you to have a temporal focus on what’s going on in the
system. Time becomes a crucial factor of the system."""
""" Event Logging allows us to model time by treating event as a snapshot in time
   and event log as our full history. It also allows for time travel in the
   sense that we can replay the log for historic debugging as well as for
auditing and traceability. We can replay it on system failures and for data replication."""

Boner discussed the following patterns for event driven architecture:
- Event Loop
- Event Stream
- Event Sourcing
- CQRS for temporal decoupling
- Event Stream Processing

Event stream processing technologies like Apache Flink, Spark Streaming,
Kafka Streams, Apache Gearpump and Apache Beam can be used to implement these
design patterns.

DDDesign, CQRS ⅋ Event Sourcing
Domain Driven Design, CQRS and Event Sourcing
@[https://www.kenneth-truyers.net/2013/12/05/introduction-to-domain-driven-design-cqrs-and-event-sourcing/]
Text indexing
Search
ºElasticsearchº
@[https://www.elastic.co/products/elasticsearch]
- distributed, RESTful (all-document-type) search
  and analytics engine
- Implemented on top of Lucene
- Developed alongside a data-collection and
  log-parsing engine called Logstash, and the
  analytics and visualisation platform  Kibana.
  - The three products form the "Elastic Stack"
    (formerly the "ELK stack"
  - At the heart of the ELK Stack, data is
    stored centrally
Solr
@[http://lucene.apache.org/solr/]
- blazing-fast, search platform built on Apache Lucene


Enterprise Patterns
ESB Architecture
- Can be defined by next feautes:
  - Monitoring of services/messages passed between them
  - wire Protocol bridge between HTTP, AMQP, SOAP, gRPC, CVS in Filesystem,...
  - Scheduling, mapping, QoS management, error handling, ..
  - Data transformation
  - Data pipelines
  - Mule, JBoss Fuse (Camel + "etc..."), BizTalk, Apache ServiceMix, ...
REF: https://en.wikipedia.org/wiki/Enterprise_service_bus#/media/File:ESB_Component_Hive.png
    ^   Special App. Services
    |
E   |   Process Automation                 BPEL, Workflow
n   |
t   |   Application Adapters               RFC, BABI, IDoc, XML-RPC, ...
e m |
r e |   Application Data Consolidation     MDM, OSCo, ...
p s |
r s |   Application Data Mapping           EDI, B2B
i a |   _______________________________
s g |   Business Application Monitoring
e e |   _______________________________
    |   Traffic Monitoring Cockpit
S c |
e h |   Special Message Services           Ex. Test Tools
r a |
v n |   Web Services                       WSDL, REST, CGI
i n |
c e |   Protocol Conversion                XML, XSL, DCOM, CORBA
e l |
    |   Message Consolidation              N.N (data locks, multi-submit,...)
B   |
u   |   Message Routing                    XI, WBI, BIZTALK, Seeburger
s   |
    |   Message Service                    MQ Series, MSMQ, ...
Business Data charts
@[https://www.forbes.com/sites/bernardmarr/2017/07/20/the-7-best-data-visualization-tools-in-2017/#643a48726c30]


Low-Level Data charts
|ºKibanaº (TODO)
|@[https://www.elastic.co/products/kibana]
|- "A Picture's Worth a Thousand Log Lines"
|- visualize (Elasticsearch) data and navigate
|  the Elastic Stack, learning  understanding
|  the impact rain might have on your quarterly
|  numbers

ºGrafanaº (TODO)
@[https://grafana.com/]
- time series analytics
- Integrates with Prometheus queries
@[https://logz.io/blog/grafana-vs-kibana/]
- 7.0+ includes new plugin architecture, visualizations,
  transformations, native trace support, ... 
https://grafana.com/blog/2020/05/18/grafana-v7.0-released-new-plugin-architecture-visualizations-transformations-native-trace-support-and-more/ 
Messaging
Summary
Messaging traditionally has two models:
ºqueuingº
  - a pool of consumers may read from a server and
    each record goes to one of them;
  - Pros: allows to divide up the processing of data
        over multiple consumer instances, which
        lets you scale your processing.
  - Cons: queues aren't multi-subscriber—once
        one process reads the data it's gone.

ºpublish-subscribeº
  - the record is broadcast to all consumers.
  - Pros: let broadcast data to multiple processes,
  - Cons: no way of scaling processing since every message
          goes to every subscriber


ºMessage Queuesº
Defined by
- message oriented architecture
- Persistence (or durability until comsuption)
- queuing
- Routing: point-to-point / publish-and-subscribe
- No processing/transformation of message/data

ºImplementations and standardsº
- AMQP
@[https://en.wikipedia.org/wiki/Advanced_Message_Queuing_Protocol]
  - Open standard network protocol
  - Often compared to JMS:
    - JMS defines API interfaces, AMQP defines network protocol
    - JMS has no requirement for how messages are formed and
      transmitted and thus every JMS broker can implement the
      messages in a different (incompatible) format.
      AMQP publishes its specifications in a downloadable XML format,
      allowing library maintainers to generate APIs driven by
      the specs while also automating construction of algorithms to
      marshal and demarshal messages.
  - brokers implementations supporting it:
    RabbitMQ, ActiveMQ, Qpid, Solace, ...
    @[https://www.amqp.org/about/examples]
    - Apache Qpid (TODO)
    @[http://qpid.apache.org/]
    - INETCO's AMQP protocol analyzer
    @[http://www.inetco.com/resource-library/technology-amqp/]
    - JORAM  JMS + AMPQ
    @[http://joram.ow2.org/] 100% pure Java implementation of JMS
    - Kaazing's AMQPºWeb Clientº
    @[http://kaazing.net/index.html]
    - Azure Service Bus+ AMPQ
- What's wrong with AMQP?
@[https://news.ycombinator.com/item?id=1657574]
- @[http://www.windowsazure.com/en-us/develop/net/how-to-guides/service-bus-amqp-overview/]
- JBoss A-MQ, built from @[http://qpid.apache.org/]
@[http://www.redhat.com/en/technologies/jboss-middleware/amq]
- IBM MQLight
@[https://developer.ibm.com/messaging/mq-light/]
- StormMQ
@[http://stormmq.com/]
  a cloud hosted messaging service based on AMQP
- RabbitMQ (by VMware Inc)
@[http://www.rabbitmq.com/];
  also supported by SpringSource
...
- Java JMS
Message Brokers
- Routing
- (De-)Multiplexing of messages from/into multiple messages to different recipients
- Durability
- Transformation (translation of message between formats)
- "things usually get blurry - many solutions are both (message queue and message
  broker) - for example RabbitMQ or QDB.  Samples for message queues are
  Gearman, IronMQ, JMS, SQS or MSMQ."
  Message broker examples are, Qpid, Open AMQ or ActiveMQ.
- Kafka can also be used as message broker but is not its main intention
OpenHFT
microSec Messaging
storing to disk
https://github.com/OpenHFT/
   - Chronicle-Queue: Micro second messaging that stores everything to disk
   - Chronicle-Accelerate: HFT meets Blockchain in Java platform
         XCL is a new cryptocurrency project that, learning from the previous
         Blockchain implementations, aims to solve the issues limiting adoption
         by building an entirely new protocol that can scale to millions of
         transactions per second, delivering consistent sub-second latency. Our
         platform will leverage AI to control volatility and liquidity, require
         low energy and simplify compliance with integrated KYC and AML support.

         The XCL platform combines low latencies (sub millisecond), IoT transaction
         rates (millions/s), open source AI volatility controls and blockchain for
         transfer of value and exchange of value for virtual fiat and crypto
         currencies. This system could be extended to other asset classes such as
         securities and fixed income. It uses a federated services model and
         regionalized payment systems making it more scalable than a blockchain
         which requires global consensus.

         The platform makes use of Chronicle-Salt for encryption and Chronicle-Bytes.

         on Chronicle Core’s direct memory and OS system call access.



   - Chronicle-Logger: A sub microsecond java logger, supporting standard logging
                      APIs such as Slf & Log4J

Apache NiFi
Data Routing
@[https://nifi.apache.org]
- Web-GUI data route+transform
- scalable directed graphs of data routing,
  transformation, and system mediation logic.
- Seamless experience between design, control,
  feedback, and monitoring
- Highly configurable:
  - Loss tolerant vs guaranteed delivery
  - Low latency vs high throughput
  - Dynamic prioritization
  - Flow can be modified at runtime
  - Back pressure
- Data Provenance
  - Track dataflow from beginning to end
- Designed for extension
  - Build your own processors and more
  - Enables rapid development and
    effective testing
- Secure
  - SSL, SSH, HTTPS, encrypted content, etc...
  - Multi-tenant authorization and internal
    authorization/policy management
REFS:
- NiFi+Spark:
@[https://blogs.apache.org/nifi/entry/stream_processing_nifi_and_spark"]
Distributed cache
(Extracted from whitepaper by Christoph Engelbert, Soft.Arch.at Hazelcast)
cache hit: data is already available in the cache when requested
           (otherwise it's said cache miss)

- Caches are implemented as simple key-value stores for performance.
- Caching-First: term to describe the situation where you start thinking
       about Caching itself as one of the main domains of your application.

ºUssageº
-ºReference Dataº
  - normally small and used to speed up the dereferencing
    of previously known, fixed number of elements (e.g.
    states of the USA, abbreviations of elements,...).
-ºActive DataSetº
- Grow to their maximum size and evict the oldest or not
  frequently used entries to keep in memory bounds.

Caching Strategies
ºCooperative (Distributed) cachingº
  different  cluster-nodes work together to build a huge, shared cache
  Ussually an "intelligent" partitioning algorithm is used to balance
  load about cluster nodes.
  - common approach when system requires large amounts of data to be cached

ºPartial Cachingº
- not all data is stored in the cache.

ºGeographical Cachingº
- located in chosen locations to optimize latency
- CDN (Content Delivery Network) is the best known example of this type of cache
- Works well when  content changes less often.

ºPreemptive Cachingº
- mostly used in conjunction with a Geographical Cache
- Using a warm-up engine a Preemptive Cache is populated on startup
  and tries to update itself based on rules or events.
- The idea behind this cache addition is to reload data from any
  backend service or central cluster even before a requestor wants
  to retrieve the element. This keeps access time to the cached
  elements constant and prevents accesses to single elements from
  becoming unexpectedly long.
- Can be difficult to implement properly and requires a lot of
  knowledge of the cached domain and the update workflows

ºLatency SLA Cachingº
- It's able to maintain latency SLAs even if the cache is slow
  or overloaded. This type of cache can be build in two different ways.
  - Having a timeout to exceed before the system either requests
    the potentially cached element from the original source
    (in parallel to the already running cache request) or simple
    default answer, using whatever returns first.
  - Always fire both requests in parallel and take whatever returns first.
    (discouraged since it mostly dimiss the value of caching). Can make
    sense if multiple caching layers are available.

Caching Topologies
-ºIn-process:º
 - cache share application's memory space.
 - most oftenly used in non-distributed systems.
 - fastest possible access speed.
 - Easy to build, but complex to grow.

-ºEmbedded Node Cachesº
- the application itself will be part of the cluster.
- kind of combination between an In-Process Cache and the
  Cooperative Caching
- it can either use partitioning or full dataset replication.

- CONST: Application and cache cannot be scaled independently

-ºClient-Server Cachesº
- these systems tend to be Cooperative Caches by having a
  multi-server architecture to scale out and have the
  same feature set as the Embedded Node Caches
  but with the client layer on top.
- This architecture keeps separate clusters of the applications
  using the cached data and the data itself, offering
  the possibility to scale the application cluster and the
  caching cluster independently.

Evict Strategies
-ºLeast Frequently usedº
 - values that are accessed the least amount of times are
   remove on memory preasure.
 - each cache record must keep track of its accesses using
   a counter which is increment only.

-ºLeast Recently Usedº
 - values that were last used most far back in terms of time
   are removed on memory preasure.
 - each record keeps must track of its last access timestamp
Other evict strategies can be found at
@[https://en.wikipedia.org/wiki/Cache_replacement_policies]

Memcached
@[https://www.memcached.org/
- distributed memory object caching system
- Memcached servers are unaware of each other. There is no crosstalk, no
  syncronization, no broadcasting, no replication. Adding servers increases
  the available memory. Cache invalidation is simplified, as clients delete
  or overwrite data on the server which owns it directly
- initially intended to speed up dynamic web applications alleviating database load

ºMemcached-session-managerº
@[https://github.com/magro/memcached-session-manager]
  tomcat HA/scalable/fault-tolerant session manager
- supports sticky and non-sticky configurations
- Failover is supported via migration of sessions
@[https://www.infoworld.com/article/3063161/why-redis-beats-memcached-for-caching.html]
Redis
- in-memory data structure store, used as a key-value database and cache
- Since it can also notify listener of changes in its state it can
  also be used as message broker (this is the case for example
  in Kubernetes, where etcd implement an asynchronous message system
  amongst its componentes).
- supports data structures such as strings, hashes, lists, sets, sorted sets
  with range queries, bitmaps, hyperloglogs and geospatial indexes with
  radius queries
- Redis has built-in replication, Lua scripting, LRU eviction, transactions
  and different levels of on-disk persistence, and provides high availability
  via Redis Sentinel and automatic partitioning with Redis Cluster
@[https://www.infoworld.com/article/3063161/why-redis-beats-memcached-for-caching.html]
@[https://www.infoq.com/news/2018/10/Redis-5-Released]
Hazelcast
in-memory
data grid
@[https://en.wikipedia.org/wiki/Hazelcast]
-  based on Java
Ehcache
(terabyte)
cache
@[http://www.ehcache.org/]
- Can be used as tcp service (distributed cache) or process-embedded
  TODO:  Same API for local and distributed objects?
- open source, standards-based cache that boosts performance, offloads I/O
- Integrates with other popular libraries and frameworks
- It scales from in-process caching, all the way to mixed
  in-process/out-of-process deployments withºterabyte-sized cachesº

ºExample Ehcache 3 APIº:
CacheManager cacheManager =
  CacheManagerBuilder.newCacheManagerBuilder()
  .withCache("preConfigured",
    CacheConfigurationBuilder.newCacheConfigurationBuilder(Long.class, String.class,
      ResourcePoolsBuilder.heap(100))
  .build())
  .build(true);

Cache˂Long, String˃ preConfigured =
    = cacheManager.getCache("preConfigured", Long.class, String.class);

Cache˂Long, String˃ myCache = cacheManager.createCache("myCache",
    CacheConfigurationBuilder.newCacheConfigurationBuilder(Long.class, String.class,
                                  ResourcePoolsBuilder.heap(100)).build());

myCache.put(1L, "da one!");
String value = myCache.get(1L);

cacheManager.close();

(simpler/lighter solution but not so escalable could be to use Google Guava Cache)
JBoss Cache
@[http://jbosscache.jboss.org/]
Distributed Storage
Distributed DB
Eventual/Strong
Consistency
@[https://hackernoon.com/eventual-vs-strong-consistency-in-distributed-databases-282fdad37cf7]
UStore!!!
@[https://arxiv.org/pdf/1702.02799.pdf]
Distributed Storage with rich semantics!!!

Today’s storage systems expose abstractions which are either  too  low-level
(e.g.,  key-value  store,  raw-blockstore)   that   they   require   developers
  to   re-invent   thewheels, or too high-level (e.g., relational databases,
Git)that they lack generality to support many classes of ap-plications.   In
this  work,  we  propose  and  implement  ageneral  distributed  data  storage
system,  called  UStore,which  has  rich  semantics.    UStore  delivers  three
 keyproperties,  namely  immutability,  sharing  and  security,which unify and
add values to many classes of today’sapplications, and which also open the
door for new ap-plications.   By  keeping  the  core  properties  within
thestorage, UStore helps reduce application development ef-forts while offering
high performance at hand. The stor-age embraces current hardware trends as key
enablers.It is built around a data-structure similar to that of Git,a  popular
source  code  versioning  system,  but  it  alsosynthesizes many designs from
distributed systems anddatabases.   Our  current  implementation  of  UStore
hasbetter  performance  than  general  in-memory  key-valuestorage systems,
especially for version scan operations.We  port  and  evaluate  four
applications  on  top  of  US-tore:  a Git-like application, a collaborative
data scienceapplication,  a transaction management application,  anda
blockchain application.   We demonstrate that UStoreenables  faster
development  and  the  UStore-backed  ap-plications can have better performance
than the existingimplementations

""" Our current implementation of UStore hasbetter performance than 
general in-memory key-valuestorage systems, especially for version 
scan operations.We port and evaluate four applications on top of 
US-tore: a Git-like application, a collaborative data 
scienceapplication, a transaction management application, anda 
blockchain application. We demonstrate that UStore enables faster 
development and the UStore-backed applications can have better 
performance than the existing implementations. """
Data Lake
@[https://en.wikipedia.org/wiki/Data_lake]
NFS
considered
harmful
@[http://www.time-travellers.org/shane/papers/NFS_considered_harmful.html]
cluster-FS
comparative
@[http://zgp.org/linux-tists/20040101205016.E5998@shaitan.lightconsulting.com.html]
Ceph
Up-to-exabytes
@[http://ceph.com/ceph-storage/]
Ceph’s RADOS provides you with extraordinary data storage  scalability—
thousands of client hosts or KVMs accessing petabytes to
exabytes of data. Each one of your applications can use the object, block or
file system interfaces to the same RADOS cluster simultaneously, which means
your Ceph storage system serves as a  flexible foundation for all of your
data storage needs. You can use Ceph for free, and deploy it on economical
commodity hardware. Ceph is a  better way to store data.

By decoupling the namespace from the underlying hardware, object-based
storage systems enable you to build much larger storage clusters. You
can scale out object-based storage systems using economical commodity hardware
, and you can replace hardware easily when it malfunctions or fails.

Ceph’s CRUSH algorithm liberates storage clusters from the scalability and
performance limitations imposed by centralized data table mapping. It
replicates and re-balance data within the cluster dynamically—elminating this
tedious task for administrators, while delivering high-performance and
infinite scalability.

See more at: http://ceph.com/ceph-storage/#sthash.KNp2tGf5.dpuf
GlusterFS
Tachyon
memory-centric
 distributed FS
@[http://tachyon-project.org/index.html]
- memory-centric distributed file system enabling reliable file sharing at memory-speed
across cluster frameworks, such as Spark and MapReduce. It achieves high performance by leveraging
lineage information and using memory aggressively. Tachyon caches working set files in memory,
thereby avoiding going to disk to load datasets that are frequently read. This enables different
jobs/queries and frameworks to access cached files at memory speed.

Tachyon is Hadoop compatible. Existing Spark and MapReduce programs can run on top of it without
any code change. The project is open source (Apache License 2.0) and is deployed at multiple companies.
It has more than 40 contributors from over 15 institutions, including Yahoo, Intel, and Redhat.

The project is the storage layer of the Berkeley Data Analytics Stack (BDAS) and also part of the
Fedora distribution.
S3QL
FUSE FS
(Amazon S2, GCS, 
OpenStack,...)
- FUSE-based file system
- backed by several cloud storages:
  - such as Amazon S2, Google Cloud Storage, Rackspace CloudFiles, or OpenStack
@[http://xmodulo.com/2014/09/create-cloud-based-encrypted-file-system-linux.html]
- S3QL is one of the most popular open-source cloud-based file systems.
- full featured file system:
  - unlimited capacity
  - up to 2TB file sizes
  - compression
  - UNIX attributes
  - encryption
  - snapshots with copy-on-write
  - immutable trees
  - de-duplication
  - hardlink/symlink support, etc.
- Any bytes written to an S3QL file system are compressed/encrypted
  locally before being transmitted to cloud backend.
- When you attempt to read contents stored in an S3QL file system, the
  corresponding objects are downloaded from cloud (if not in the local
  cache), and decrypted/uncompressed on the fly.
Minio.io
@[https://minio.io]
- Private Cloud Storage
- high performance distributed object storage server, designed for
  large-scale private cloud infrastructure. Minio is widely deployed across the
  world with over 146.6M+ docker pulls.
Why Object storage?
@[https://www.ibm.com/developerworks/library/l-nilfs-exofs/index.html]

Object storage is an interesting idea and makes for a much more scalable
system. It removes portions of the file system from the host and pushes them
into the storage subsystem. There are trade-offs here, but by distributing
portions of the file system to multiple endpoints, you distribute the workload,
making the object-based method simpler to scale to much larger storage systems.
Rather than the host operating system needing to worry about block-to-file
mapping, the storage device itself provides this mapping, allowing the host to
operate at the file level.

Object storage systems also provide the ability to query the available
metadata. This provides some additional advantages, because the search
capability can be distributed to the endpoint object systems.

Object storage has made a comeback recently in the area of cloud storage. Cloud
storage providers (which sell storage as a service) represent their storage as
objects instead of the traditional block API. These providers implement APIs
for object transfer, management, and metadata management.
Distributed Tracing
Zipkin
@[https://zipkin.io/]
Genesis Distributed Testing
@[https://docs.whiteblock.io/introduction_to_testing.html]
The following are types of tests one can perform on a distributed system:
- Functional Testing is conducted to test whether a system performs as it was 
  specified or in accordance with formal requirements
- Performance Testing tests the reliability and responsiveness of a system 
  under different types of conditions and scenarios
- Penetration Testing tests the system for security vulnerabilities
- End-to-End Testing is used to determine whether a system’s process flow 
  functions as expected
- Fuzzing is used to test how a system responds to unexpected, random, or 
  invalid data or inputs

Genesis is a versatile testing platform designed to automate the tests listed 
above, making it faster and simpler to conduct them on distributed systems 
where it was traditionally difficult to do so. Where Performance, End-to-End, 
and Functional testing comprise the meat of Genesis’ services, other types of 
testing are enabled through the deployment of services and sidecars on the 
platform.

End-to-End tests can be designed by applying exit code checks for process 
completion, success, or failure in tasks and phases, while Performance tests 
can be conducted by analyzing data from tests on Genesis that apply a variety 
of network conditions and combinations thereof. Functional tests can use a 
combination of tasks, phases, supplemental services and sidecars, and network 
conditions, among other tools.

These processes and tools are further described in this documentation.

AAA
keycloak
@[https://www.keycloak.org/]
"""Add authentication to applications and secure services with minimum fuss"""
-  No need to deal with storing users or authenticating users.
- It's all available out of the box.
- Advanced features such as User Federation, Identity Brokering and Social Login.

   Single-Sign On                        LDAP and Active Directory
   Login once to multiple applications   Connect to existing user directories
   Standard Protocols                    Social Login
   OpenID Connect, OAuth 2.0             Easily enable social login
   and SAML 2.0

   Identity Brokering                    Clustering
   OpenID Connect or SAML 2.0 IdPs       For scalability and availability
   High Performance                      Themes
   Lightweight, fast and scalable        Customize look and feel

   Extensible
   Customize through code
   Password Policies
   Customize password policies
FreeIPA
@[https://www.reddit.com/r/linuxadmin/comments/apbjtc/freeipa_groups_and_linux_usernames/]
Non-classified
TOGAF
@[https://en.wikipedia.org/wiki/The_Open_Group_Architecture_Framework]
The Open Group Architecture Framework (TOGAF) is a framework for enterprise
architecture that provides an approach for designing, planning, implementing,
and governing an enterprise information technology architecture.[2] TOGAF is
a high level approach to design. It is typically modeled at four levels:
Business, Application, Data, and Technology. It relies heavily on
modularization, standardization, and already existing, proven technologies and
products.

TOGAF was developed starting 1995 by The Open Group, based on DoD's TAFIM. As
of 2016, The Open Group claims that TOGAF is employed by 80% of Global 50
companies and 60% of Fortune 500 companies.[3]
API Management
@[https://www.datamation.com/applications/top-api-management-tools.html]

Behaviour
Driven
Development
(BDD)
- Cucumber, ...
@[https://cucumber.io/docs/guides/10-minute-tutorial/]
End-to-End Request Tracing
https://www.jaegertracing.io/
https://logz.io/blog/zipkin-vs-jaeger/
Request tracing is the ultimate insight tool. Request tracing tracks operations
inside and across different systems. Practically speaking, this allows
engineers to see the how long an operation took in a web server, database,
application code, or entirely different systems, all presented along a
timeline. Request tracing is especially valuable in distributed systems where a
single transaction (such as “create an account”) spans multiple systems

Problems that Jaeger addresses:
- distributed transaction monitoring
- performance and latency optimization
- root cause analysis
- service dependency analysis
- distributed context propagation
Nexenta
@[https://nexenta.com/]
- Software-Defined Storage Product Family.
Disruptor
  Disruptor
  See also
  lmax-exchange.github.com

  When Disruptor is not good fit
GraphQL
GraphQL: Facebook-developed language that provides a powerful API to get only
the dataset you need in a single request, seamlessly combining data sources.

REF: https://www.redhat.com/en/topics/api/what-is-graphql
The purpose of this monorepo is to give the GraphQL Community:
    a to-specification official language service (see: API Docs)
    a comprehensive LSP server and CLI service for use with IDEs
    a codemirror mode
    a monaco mode (in the works)
    an example of how to use this ecosystem with GraphiQL.
    examples of how to implement or extend GraphiQL.


ApolloGraphQL
Simplify app development by combining APIs, databases, and microservices into a
single data graph that you can query with GraphQL
@[https://www.apollographql.com/]
Big Data
Ambari
Hadoop cluster
provisioning
@[https://projects.apache.org/project.html?ambari]
Apache Ambari makes Hadoop cluster provisioning, managing, and monitoring dead simple.
Spark
@[http://spark.apache.org/]
- Speed
  - Run workloads 100x faster.
  - Apache Spark achieves high performance for both batch
    and streaming data, using a state-of-the-art DAG scheduler,
    a query optimizer, and a physical execution engine.

- Ease of Use
  - Write applications quickly in Java, Scala, Python, R, and SQL.
  - Spark offers over 80 high-level operators that make it easy to
    build parallel apps. And you can use it interactively from the
    Scala, Python, R, and SQL shells.
    - Example PythonBºDataFrame APIº.
      |Bºdfº=ºspark.read.jsonº("logs.json") ← automatic schema inference
      |Bºdfº.where("age ˃ 21")
      |    .select("name.first").show()


- Generality
  - Combine SQL, streaming, and complex analytics.
  - Spark powers a stack of libraries including SQL and DataFrames,
    MLlib for machine learning, GraphX, and Spark Streaming.
    You can combine these libraries seamlessly in the same application.

- Runs Everywhere
  - Spark runs on Hadoop, Apache Mesos, Kubernetes, standalone,
    or in the cloud. It can access diverse data sources.

- You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN,
  on Mesos, or on Kubernetes. Access data in HDFS, Alluxio, Apache Cassandra,
  Apache HBase, Apache Hive, and hundreds of other data sources.

- Common applications for Spark include real-time marketing campaigns, online
  product recommendations, cybersecurity analytics and machine log monitoring.

Hadoop "vs" Spark
@[https://www.infoworld.com/article/3014440/big-data/five-things-you-need-to-know-about-hadoop-v-apache-spark.html]
Hadoop is essentially a distributed data infrastructure:
 -It distributes massive data collections across multiple nodes
  within a cluster of commodity servers
 -It also indexes and keeps track of that data, enabling
  big-data processing and analytics far more effectively
  than was possible previously.
Spark, on the other hand, is a data-processing tool that operates on those
distributed data collections; it doesn't do distributed storage.

You can use one without the other:
  - Hadoop includes not just a storage component, known as the
  Hadoop Distributed File System, but also a processing component called
  MapReduce, so you don't need Spark to get your processing done.
  - Conversely, you can also use Spark without Hadoop. Spark does not come with
  its own file management system, though, so it needs to be integrated with one
  - if not HDFS, then another cloud-based data platform. Spark was designed for
  Hadoop, however, so many agree they're better together.

Spark is generally a lot faster than MapReduce because of the way it processes
data. While MapReduce operates in steps, Spark operates on the whole data set
in one fell swoop:
   "The MapReduce workflow looks like this: read data from the cluster, perform
    an operation, write results to the cluster, read updated data from the
    cluster, perform next operation, write next results to the cluster, etc.,"
    explained Kirk Borne, principal data scientist at Booz Allen Hamilton.
    Spark, on the other hand, completes the full data analytics operations
    in-memory and in near real-time:
    "Read data from the cluster, perform all of the requisite analytic
    operations, write results to the cluster, done," Borne said.
Spark can be as much as 10 times faster than MapReduce for batch processing and
p to 100 times faster for in-memory analytics, he said.
  You may not need Spark's speed. MapReduce's processing style can be just fine
if your data operations and reporting requirements are mostly static and you
can wait for batch-mode processing. But if you need to do analytics on
streaming data, like from sensors on a factory floor, or have applications that
require multiple operations, you probably want to go with Spark.
 Most machine-learning algorithms, for example, require multiple operations.

Recovery: different, but still good.
Hadoop is naturally resilient to system faults or failures since data
are written to disk after every operation, but Spark has similar built-in
resiliency by virtue of the fact that its data objects are stored in something
called resilient distributed datasets distributed across the data cluster.
"These data objects can be stored in memory or on disks, and RDD provides full
recovery from faults or failures," Borne pointed out.
Hive
Data W.H. on top of Hadoop
@[https://projects.apache.org/project.html?hive]
- querying and managing utilities for large datasets
  residing in distributed storage built on top of Hadoop
- easy data extract/transform/load (ETL)
 * a mechanism to impose structure on a variety of data formats*
- access to files stored in Apache HDFS and other data storage
  systems such as Apache HBase (TM)

HiveMQ Broker: MQTT+Kafka for IoT
@[https://www.infoq.com/news/2019/04/hivemq-extension-kafka-mqtt/]

Data Science
Orange
@[https://orange.biolab.si/screenshots/]
Features:
- Interactive Data Visualization
- Visual Programming
- Student's friendly.
- Add-ons
- Python Anaconda Friendly
  $ conda config --add channels conda-forge
  $ conda install orange3
  $ conda install -c defaults pyqt=5 qt
- Python pip  Friendly
  $ pip install orange3
Example Architectures
observability: 
      loging
+ monitoring
+    tracing
Ex:ºFile Integrity Monitoring at scale: (RSA Conf)º
@[https://www.rsaconference.com/writable/presentations/file_upload/csv-r14-fim-and-system-call-auditing-at-scale-in-a-large-container-deployment.pdf]
Auditing log to gain insights at scale:

                         ┌─→ Pagerduty
            ┌─→ Grafana ─┼─→ Email
   Elastic  │            └─→ Slack
   Search  ─┼─→ Kibana
            │
            └─→ Pre─processing ─→ TensorFlow

Alt1:
  User   │ go-audit-                          User space
  land   │ container                             app
  ───────├─────  Netlink ───── Syscall iface ───────────
  Kernel │        socket           ^
         │          ^              |
                    └─  Kauditd ───┘
Loggin@Coinbase
@[https://www.infoq.com/news/2019/02/metrics-logging-coinbase]
Adidas Reference Architecture
Source @[https://adidas.github.io]

ºuser interfaceº

Working with the latest JavaScript technologies as well as building a stable
ecosystem is our goal. We are not only using one of many available frameworks,
we test them, we analyze our necessities and we choose what has the best
balance between productivity and developer satisfaction.

adidas is moving most of the applications from the desktop to the browser, and
we are improving every day to achieve it in the less time possible:
standardization, good practices, continuous deployment, automated tasks are
part of our daily work.

Our stack has changed a lot since the 2015. Back then we used code which is
better not to speak about. Now, we power our web applications with frameworks
like React and Vue and modern tooling. Our de facto tool for shipping these
applications is webpack.

We love how JavaScript is changing everyday and we encourage the developers to
use its latest features. TypeScript is also a good choice for developing
consistent, highly maintainable applications.


ºapiº
Apiary is our collaborative platform for designing, documenting and governing
API's. It helps us to adopt API first approach with templates and best
practices. Platform is strongly integrated with our API guidelines and helps us
to achieve consistency of all our API's. Apiary also speeds up and simplifies
the development and testing phase by providing API mock services and automation
of API testing.

Mashery is an API platform that delivers key API management capabilities needed
for successful digital transformation initiatives. Its full range of
capabilities includes API creation, packaging, testing, and management of your
API's and community of users. API security is provided via an embedded or
optional on-premise API gateway.

Runscope is our continuous monitoring platform for all our API's we are
exposing. It monitors uptime, performance and validates the data. It is tightly
integrated with Slack and Opsgenie in order to alert and create incident if
things are going wrong way. Using Runscope we can detect any issues with our
API's super fast and act accordingly to achieve highest possible SLA.

ºfast dataº
Apache Kafka is the core of our Fast Data Platform.
The more we work with it, the more versatile we perceived it was, covering more
use cases apart of event sourcing strategies.

It is designed to implement the pub-sub pattern in a very efficient way,
enabling Data Streaming cases, where multiple can easily subscribe to the same
stream of information. The fan-out is the pub-sub pattern is implemented in a
really efficient way, allowing to achieve high thought in the production and
the consumption part.

It also enables other capabilities like Data Extraction and Data Modeling, as
long as Stateful Event Processing. Together with Storm, Kafka Streams and the
Elastic stack, provides the perfect toolset to integrate our digital
applications in a modern self-service way.

ºacidº
Jenkins has become the de facto standard for continuous integration/continuous
delivery projects.
ACID is a 100% docker powered by Kubernetes that allows you to create a Jenkins
as a Service platform. ACID provides an easy way to generate an isolated
Jenkins environment per team while keeping a shared basic configuration and a
central management dashboard.
Updating and maintaining teams instances have never been so easy. Team freedom
provided out of the box, gitops disaster recovery capabilities and elastic
slaves allocation are the key pillars.

ºtestingº
SerenityBDD is our default for automation in test solution. Built on top of
Selenium and Cucumber, it enables us to write cleaner and more maintainable
automated acceptance and regression tests in a faster way. With archetypes
available for both frontend (web) and backend (API), it is almost
straightforward to be implemented in most projects.

Supporting a wide type of protocols and application types, Neoload is our tool
of choice for performance testing. It provides an easy to use interface with
powerful recording capabilities, allowing manual implementation as well if the
user chooses. Since it supports dockerized load generators, it makes it easy to
create a shared infrastructure and integrate it in our continuous integration
chain.

ºmonitoringº
From the monitoring perspective, we are using different tools trying to provide
internally end-to-end monitoring, for a better troubleshooting.

For system and alerting monitoring, we are using Prometheus, an open source
solution, easy for developers and completely integrated with Kubernetes. For
infrastructure monitoring we have Icinga with hundreds of custom scripts and
graphite as timeseries database. About APM, Instana is our solution, which
makes easy and fast to monitor the applications and identify bottlenecks. We
use Runscope to test and monitor our APIs with continuous schedule to give us
completely visibility into API problems.

Grafana is the visualization tool by default which is integrated with
Prometheus, Instana, Elasticsearch, Graphite, etc.

For logging, we are based on the ELK stack running on-premises and AWS
Kubernetes, with custom components for processing and alerting being able to
notify the problems to the teams.

As the goal of monitoring is the troubleshooting, all these tools have
integration with Opsgenie, where we have defined escalation policies for
alerting and notifications. The incidents can be communicated to the final
users via Statuspage.

ºmobileº
At mobile development, we run out of hybrid solutions. Native is where we live.

We work with the latest and most concise, expressive and safe languages
available at the moment for android and iOS mobile phones: Kotlin and Swift.

Our agile mobile development process requires big doses of automation to test,
build and distribute reliably and automatically every single change in the app.
We rely on tools such as Jenkins and Fastlane to automate those steps.
100% Serverless AWS arch
What a typical 100% Serverless Architecture looks like in AWS!
https://medium.com/serverless-transformation/what-a-typical-100-serverless-architecture-looks-like-in-aws-40f252cd0ecb 
Facebook
netconsole monit@scale
@[http://www.serverwatch.com/server-news/linuxcon-how-facebook-monitors-hundreds-of-thousands-of-servers-with-netconsole.html]
"""""" Facebook had a system in the past for monitoring
that used syslog-ng, but it was less than 60 percent reliable.
In contrast, Owens stated netconsole is highly scalable and can
handle enormous log volume with greater than 99.99 percent
reliability.  """""""
Badoo
20 billion Events/day
@[https://www.infoq.com/news/2019/08/badoo-20-billion-events-per-day/]
Non-Ordered
When Ceph is Not Enough
Josh Goldenhar, VP Product Marketing, Lightbits Labs
: There’s a new kid on the “block”
About this webinar
In this 30-minute webinar, we'll discuss the origins of Ceph and why 
it's a great solution for highly scalable, capacity optimized storage 
pools. You’ll learn how and where Ceph shines but also where its 
architectural shortcomings make Ceph a sub-optimal choice for today's 
high performance, scale-out databases and other key web-scale 
software infrastructure solutions.

Participants will learn:
• The evolution of Ceph
• Ceph applicability to infrastructures such as OpenStack, 
OpenShift and other Kubernetes orchestration environments
• Why Ceph can't meet the block storage challenges of modern, 
scale-out, distributed databases, analytics and AI/ML workloads:
• Where Cephs falls short on consistent latency response
• Overcoming Ceph’s performance issues during rebuilds
• How you can deploy high performance, low latency block storage in 
the same environments Ceph integrates with, alongside your Ceph 
deployment
logstash-plugins
https://github.com/logstash-plugins
Plugins that can extend Logstash's functionality.
   logstash-codec-protobuf  parsing Protobuf messages
   logstash-integration-kafka
   logstash-filter-elasticsearch
   amazon-s3-storage
   logstash-codec-csv
   ...
awesome influxdb
https://github.com/PoeBlu/awesome-influxdb
A curated list of awesome projects, libraries, tools, etc. related to InfluxDB
Tools whose primary or sole purpose is to feed data into InfluxDB.

- accelerometer2influx - Android application that takes the x-y-z axis metrics 
   from your phone accelerometer and sends the data to InfluxDB.
- agento - Client/server collecting near realtime metrics from Linux hosts
- aggregateD - A dogstatsD inspired metrics and event aggregation daemon for 
  InfluxDB
- aprs2influxdb - Interfaces ham radio APRS-IS servers and saves packet data 
  into an influxdb database
- Charmander - Charmander is a lab environment for measuring and analyzing 
  resource-scheduling algorithms
- gopherwx - a service that pulls live weather data from a Davis Instruments 
  Vantage Pro2 station and stores it in InfluxDB
- grade - Track Go benchmark performance over time by storing results in 
  InfluxDB
- Influx-Capacitor - Influx-Capacitor collects metrics from windows machines 
  using Performance Counters. Data is sent to influxDB to be viewable by grafana
- Influxdb-Powershell - Powershell script to send Windows Performance counters 
  to an InfluxDB Server
- influxdb-logger - SmartApp to log SmartThings device attributes to an 
  InfluxDB database
- influxdb-sqlserver - Collect Microsoft SQL Server metrics for reporting to 
  InfluxDB and visualize them with Grafana
- k6 - A modern load testing tool, using Go and JavaScript
- marathon-event-metrics - a tool for reporting Marathon events to InfluxDB
- mesos-influxdb-collector - Lightweight mesos stats collector for InfluxDB
- mqforward - MQTT to influxdb forwarder
- node-opcua-logger - Collect industrial data from OPC UA Servers
- ntp_checker - compares internal NTP sources and warns if the offset between 
  servers exceeds a definable (fraction of) seconds
- proc_to_influxdb - Console app to observe Windows process starts and stops 
  via InfluxDB
- pysysinfo_influxdb - Periodically send system information into influxdb (uses 
  python3 + psutil, so it also works under Windows)
- sysinfo_influxdb - Collect and send system (linux) info to InfluxDB
- snmpcollector - A full featured Generic SNMP data collector with Web 
  Administration Interface for InfluxDB
- Telegraf - (Official) plugin-driven server agent for reporting metrics into 
  InfluxDB
- tesla-streamer - Streams data from Tesla Model S to InfluxDB (rake task)
- traffic_stats - Acquires and stores statistics about CDNs controlled by 
  Apache Traffic Control
- vsphere-influxdb-go - Collect VMware vSphere, vCenter and ESXi performance 
  metrics and send them to InfluxDB
Consistency models
(in distributed databases)
ºFive consistency modelsº natively supported by the A.Cosmos DBºSQL APIº
(SQL API is default API):
- native support for wire protocol-compatible APIs for 
  popular databases is also provided including
 ºMongoDB, Cassandra, Gremlin, and Azure Table storageº.
  RºWARN:º These databases don't offer precisely defined consistency
    models or SLA-backed guarantees for consistency levels.
    They typically provide only a subset of the five consistency
    models offered by A.Cosmos DB.
- For SQL API|Gremlin API|Table API default consistency level
  configured on theºA.Cosmos DB accountºis used.

Comparative Cassandra vs Cosmos DB:
Cassandra 4.x       Cosmos DB           Cosmos DB
                    (multi-region)      (single region)
ONE, TWO, THREE     Consistent prefix   Consistent prefix
LOCAL_ONE           Consistent prefix   Consistent prefix
QUORUM, ALL, SERIAL Bounded stale.(def) Strong
                    Strong in Priv.Prev
LOCAL_QUORUM        Bounded staleness   Strong
LOCAL_SERIAL        Bounded staleness   Strong

Comparative MongoDB 3.4 vs Cosmos DB

MongoDB 3.4         Cosmos DB           Cosmos DB
                    (multi-region)      (single region)
Linearizable        Strong              Strong
Majority            Bounded staleness   Strong
Local               Consistent prefix   Consistent prefix
MFA OOSS
https://opensource.com/article/20/3/open-source-multi-factor-authentication Open source alternative for multi-factor authentication: privacyIDEA 
Chrony (NTP replacement)
https://www.infoq.com/news/2020/03/ntp-chrony-facebook/ 
Facebook’s Switch from ntpd to chrony for a More Accurate, Scalable NTP Service
Apache Beam
Apache Beam provides an advanced unified programming model, allowing 
you to implement batch and streaming data processing jobs that can 
run on any execution engine.
Allows to execute pipelines on multiple environments such as Apache 
Apex, Apache Flink, Apache Spark among others.
Apache Ignite
Apache Ignite is a high-performance, integrated and distributed 
in-memory platform for computing and transacting on large-scale data 
sets in real-time, orders of magnitude faster than possible with 
traditional disk-based or flash technologies.
Apache PrestoDB
@[https://prestodb.io/]
- distributed SQL query engine originally developed by Facebook.
- running interactive analytic queries against data sources of
  all sizes ranging from gigabytes to petabytes.
- Engine can combine data from multiple sources (RDBMS, No-SQL, 
  Hadoop) within a single query, and it has little to no performance 
  degradation running. Being used and developed by big data giants 
  like, among others,  Facebook, Twitter and Netflix, guarantees a 
  bright future for this tool.
- Designed and written from the ground up for interactive 
  analytics and approaches the speed of commercial data warehouses 
  while scaling to the size of organizations like Facebook. 
- NOTE: Presto uses Apache Avro to represent data with schemas.
  (Avro is "similar" to Google Protobuf/gRPC)
Google Colossus FS
@[https://www.systutorials.com/3202/colossus-successor-to-google-file-system-gfs/]
ddbb from scratch
Writing a SQL database from scratch in Go: 3. indexes | notes.eatonphil.com !!!
https://notes.eatonphil.com/database-basics-indexes.html 
Mattermost+Discourse
CERN, cambia el uso de Facebook Workplace por Mattermost y Discourse
https://www.linuxadictos.com/cern-cambia-el-uso-de-facebook-workplace-por-mattermost-y-discourse.html 
Intel Optane DC performance
http://mikaelronstrom.blogspot.com/2020/02/benchmarking-5-tb-data-node-in-ndb.html?m=1
Through the courtesy of Intel I have access to a machine with 6 TB of Intel
Optane DC Persistent Memory. This is memory that can be used both as
persistent memory in App Direct Mode or simply used as a very large
DRAM in Memory Mode.

Slides for a presentation of this is available at slideshare.net.

This memory can be bigger than DRAM, but has some different characteristics
compared to DRAM. Due to this different characteristics all accesses to this
memory goes through a cache and here the cache is the entire DRAM in the
machine.

In the test machine there was a 768 GB DRAM acting as a cache for the
6 TB of persistent memory. When a miss happens in the DRAM cache
one has to go towards the persistent memory instead. The persistent memory
has higher latency and lower throughput. Thus it is important as a programmer
to ensure that your product can work with this new memory.

What one can expect performance-wise is that performance will be similar to
using DRAM as long as the working set is smaller than DRAM. As the working
set grows one expects the performance to drop a bit, but not in a very significant
manner.

We tested NDB Cluster using the DBT2 benchmark which is based on the
standard TPC-C benchmark but uses zero latency between transactions in
the benchmark client.

This benchmark has two phases, the first phase loads the data from 32 threads
where each threads loads one warehouse at a time. Each warehouse contains
almost 500.000 rows in a number of tables.
SOAP vs RESTfull vs ...
SOAP vs RESTfull vs JSON-RPC vs gRPC vs Drift
EventQL
eventql/eventql: Distributed "massively parallel" SQL query engine
https://github.com/eventql/eventql 
SAN vs ...
https://serversuit.com/community/technical-tips/view/storage-area-network,-and-other-storage-methods.html 
Storage Area Network, and Other Storage Methods
n-gram index
 "...Developed an indexing and search software, mostly in C. 
  The indexer is multi-threaded and computes a n-gram index 
  - stored in B-Trees - on hundreds of GB of data generated 
  every day in production. The associated search engine is also
  a multi-threaded, efficient C program."

- In the fields of computational linguistics and probability, an n-gram 
  is a contiguous sequence of n items from a given sample of text or 
  speech. The items can be phonemes, syllables, letters, words or base 
  pairs according to the application. The n-grams typically are 
  collected from a text or speech corpus. When the items are words, 
  n-grams may also be called shingles[clarification needed].
Apollo GraphQL Middelware
https://www.infoq.com/news/2020/06/apollo-platform-graphql/
Apollo Data Graph Platform: A GraphQL Middleware Layer for the Enterprise
Genesis: end-to-end testing
Genesis: An End-To-End Testing & Development Platform For Distributed Systems
https://github.com/EntEthAlliance/genesis/branches
Outputting data to collect for later analysis is as simple as writing 
a JSON object to stdout. You can review the data we collect from each 
test on the test details page in the dashboard, build your own 
metrics and analytics dashboards with Kibana, and write your own 
custom data analysis and reports using Jupyter Notebooks.
Graph Management Software
Exploitation:
 - Visualization: Gephi, Cytoscape, LinkUrious
 - APIs: Apache TinkerPop, 

- DDBB:
  - JanusGraph, Neo4j, DataStax, MarkLogic, ...
- Processing Systems:
  - Spark, Apache Giraph, Oracle Labs PGX,
- Storage:
  - Hadoop, Neo4j, Apache HBase, Cassandra.
fallacies of distributed computing
@[https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing]
The fallacies are[1]

    The network is reliable;
    Latency is zero;
    Bandwidth is infinite;
    The network is secure;
    Topology doesn't change;
    There is one administrator;
    Transport cost is zero;
    The network is homogeneous.

The effects of the fallacies
- Software applications are written with little error-handling on 
  networking errors. During a network outage, such applications may 
  stall or infinitely wait for an answer packet, permanently consuming 
  memory or other resources. When the failed network becomes available, 
  those applications may also fail to retry any stalled operations or 
  require a (manual) restart.
- Ignorance of network latency, and of the packet loss it can cause, 
  induces application- and transport-layer developers to allow 
  unbounded traffic, greatly increasing dropped packets and wasting 
  bandwidth.
- Ignorance of bandwidth limits on the part of traffic senders can 
  result in bottlenecks.
- Complacency regarding network security results in being blindsided 
  by malicious users and programs that continually adapt to security 
  measures.[2]
- Changes in network topology can have effects on both bandwidth and 
  latency issues, and therefore can have similar problems.
- Multiple administrators, as with subnets for rival companies, may 
  institute conflicting policies of which senders of network traffic 
  must be aware in order to complete their desired paths.
- The "hidden" costs of building and maintaining a network or subnet 
  are non-negligible and must consequently be noted in budgets to avoid 
  vast shortfalls.
- If a system assumes a homogeneous network, then it can lead to the 
  same problems that result from the first three fallacies.
Jepsen Project
https://jepsen.io/
Jepsen is an effort to improve the safety of distributed databases, queues, 
consensus systems, etc. We maintain an open source software library for systems 
testing, as well as blog posts and conference talks exploring particular 
systems’ failure modes. In each analysis we explore whether the system lives 
up to its documentation’s claims, file new bugs, and suggest recommendations 
for operators.

Jepsen pushes vendors to make accurate claims and test their software 
rigorously, helps users choose databases and queues that fit their needs, and 
teaches engineers how to evaluate distributed systems correctness for 
themselves.