DB Engines Types
Ext.Links
@[https://db-engines.com/]
@[https://db-engines.com/en/ranking]
@[https://dzone.com/articles/rant-there-is-no-nosql-data-storage-engine]

TODO
@[https://www.infoq.com/news/2018/09/pinterest-goku-timeseries-db]
  Pinterest Switches from OpenTSDB to Their Own Time Series Database
- https://facebook.github.io/prophet/
  Prophet: TSDB forecasting library (wrapper around Stan), particularly
  approachable place to use Bayesian Inference for forecasting use cases
  general purpose.
  - Prophet is a procedure for forecasting time series data based on an
    additive model where non-linear trends are fit with yearly, weekly, and daily
    seasonality, plus holiday effects. It works best with time series that have
    strong seasonal effects and several seasons of historical data. Prophet is
    robust to missing data and shifts in the trend, and typically handles outliers well.

@[https:// technologyconversations.com/2015/09/08/service-discovery-zookeeper-vs-etcd-vs-consul/]
@[https://www.infoq.com/news/2019/05/hashicorp-consul-1.5.0]
Consistency models
(in distributed databases)
ºFive consistency modelsº natively supported by the A.Cosmos DBºSQL APIº
(SQL API is default API):
- native support for wire protocol-compatible APIs for 
  popular databases is also provided including
 ºMongoDB, Cassandra, Gremlin, and Azure Table storageº.
  RºWARN:º These databases don't offer precisely defined consistency
    models or SLA-backed guarantees for consistency levels.
    They typically provide only a subset of the five consistency
    models offered by A.Cosmos DB.
- For SQL API|Gremlin API|Table API default consistency level
  configured on theºA.Cosmos DB accountºis used.

Comparative Cassandra vs Cosmos DB:
Cassandra 4.x       Cosmos DB           Cosmos DB
                    (multi-region)      (single region)
ONE, TWO, THREE     Consistent prefix   Consistent prefix
LOCAL_ONE           Consistent prefix   Consistent prefix
QUORUM, ALL, SERIAL Bounded stale.(def) Strong
                    Strong in Priv.Prev
LOCAL_QUORUM        Bounded staleness   Strong
LOCAL_SERIAL        Bounded staleness   Strong

Comparative MongoDB 3.4 vs Cosmos DB

MongoDB 3.4         Cosmos DB           Cosmos DB
                    (multi-region)      (single region)
Linearizable        Strong              Strong
Majority            Bounded staleness   Strong
Local               Consistent prefix   Consistent prefix

Distributed DB Eventual/Strong Consistency
@[https://hackernoon.com/eventual-vs-strong-consistency-in-distributed-databases-282fdad37cf7]
dbEngine: Data Topology compared
  Key-value:    wide-Column:         Document
  o -- O        o -- O O O O ... O   O→O→O...
  o -- O        o -- O O O O ... O    ↘    
  o -- O        o -- O O O O ... O     O→O
  ...           ...                     ↘   
                                         O→O
                                          ↘ 
                                           O  
Graph DBMS
- represent data in graph structures of nodes (entities) and edges(relations).
- Relations are are first class citizens (vs foreign keys in SQL ddbbs, 
  "Relational DDBBs lack relationships, some queries are very easy, some
   others require complex joins" ).

- BºModeling real world is easier than in SQL/RDBMs.  º
  BºAlso, they are resillient to model/schema changes.º
- Sort of "hyper-relational" databases.
- Two  properties should  be considered in graph DDBBs:
  └ºunderlying storageº:
    - native optimized graph storage 
    - Non native (serialization to relational/Document/key-value/...  ddbbs.
  └ºprocessing engineº:
    -ºindex-free adjacencyº: ("true graph ddbb"), connected  nodes  physically
      "point"  to  each  other  in  the  database.
    BºMuch faster when doing graph-like queries.º Query performance 
      remains (aproximately) constant in "many-joins" scenarios, while RDBMs
      degrades exponentially with JOINs.
      Time is proportional to the size of the path/s traversed by the query,
      (vs subset of the overall table/s).
    Rº(usually) they DON'T provide indexes on all nodes:º
    Rº- Direct access to nodes based on attribute values is not possibleº
    Rº  or slower.º
    -ºnon-index-free adjacencyº: External indexes are used to link nodes.
    BºLower memory use.º 

          Native |·Infi.Graph        ·OrientDB ·Neo4j
                 |·Titan             ·Affinity ·*dex
           ^^^^^ |
           Graph |
      Processing |                   ·Franz Inc
           vvvvv |                   ·Allegro Graph
                 |·FlockDB           ·HyperGraphDB
                 └---------------------------------
              non         ← Graph  →      Native
              Native        Storage
- Graph Comute Engines execute graph algorithms against large datasets. 
  - Example algorithms include:
    - Idetify clusters in data
    - Answer "how many average relationships do node have"
    - Find "paths" to get from one node to another.
 
  - It can be independent of the underlying storage or not.
  - They can be classified by:
    - in-memory/single machine (Cassovary,...)
    - Distributed: Pegasys, Giraph, ... Most of the based on the white-paper
    BºPregel: A system fro large-scale graph processingº (2010 ACM)
    @[https://dl.acm.org/doi/10.1145/1807167.1807184]
    "Many practical computing problems concern large graphs. ... billions of 
     vertices, trillions of edges ... In this paper we present a
     computational model suitable for this task. "


- GraphQL ISO standard Work in place integrating the 
  ideas from  openCypher (Neo4j contribution), PGQL, GSQL,
  and G-CORE languages.
  - @[https://www.gqlstandards.org/]
- OpenCypher support:
  - Neo4jDB        - Cypher for Spark/Gremin
  - AgensGraph     - Mengraph
  - RedisGraph     - inGraph
  - SAPHANA Graph  - Cypher.PL

BºExamplesº
- Neo4j
- AGE: @[https://www.postgresql.org/about/news/2050/]
  - multi-model graph database extension for PostgreSQL 
  - Support for Subset of Cypher Expressions through
    @[https://www.opencypher.org/]
- AWS Neptune
- Azure Cosmos DB
- Datastax Enterprise
- OrientDB
- ArangoDB

- there areºtwo primary implementationsºboth consisting of
  nodes(or "vertices") and directed edges (or "links"). 
  Data models looking slightly different on each implementation.
  Both allow properties (attribute/value pairs) to be
  associated with nodes.
  -ºgraphs—property graphs (PG)º
    They also allow attributes for edges.
    - no open standards exist for schema definition,
      query languages, or data interchange formats. 
  -ºW3C Resource Description Framework (RDF)º
    Node properties treated simply as more edges.
    (techniques exists to express edge properties in RDF).
    - graphs can be represented as edges or "triples":
      edge triple = ( starting point, label     , end point)  or
      edge triple = (subject        , "predicate, object   )
      (Also called "triple stores")
    - RDF forms part of (a suite of) standardized W3C specs.
      built on other web standards, collectively known as
    BºSemantic Web, or Linked Dataº. They include:
      - schema languages (RDFS and OWL):
      - declarative query language (SPARQL)
      - serialization formats
      - Supporting specifications:
        - mapping RDBMs to RDF graphs
      -Bºstandardized framework for inferenceº
       Bº(ex, drawing conclusions from data)º

    - Eclipse rdf4j: https://rdf4j.org/
      OOSS modular Java framework for working with RDF data:
      parsing/storing/inferencing/querying of/over such data.
      - easy-to-use API 
      - can be connected to all leading RDF storage solutions.
        with Two out-of-the-box RDF databases (the in-memory store 
        and the native store).
      - connect with SPARQL endpoints to leverage the power
        of Linked Data and Semantic Web.
        with SPARQL 1.1 query and update language
      - The framework offers a large scala of tools 
        repositories using the exact same API as for local access.
      - supports mainstream RDF file formats:
        - RDF/XML, Turtle, N-Triples, N-Quads, JSON-LD, TriG and TriX.
      - Used for example in  semanticturkey:
        http://semanticturkey.uniroma2.it/doc/dev/

  PG resemble conventional data structures in an isolated
  application or use case, whereas RDF are originally designed
  to support interoperability and interchange across 
  independently developed applications.

  - (@[https://www.allthingsdistributed.com/2019/12/power-of-relationships.html])
    "...Because developers ultimately just want to do graphs, you 
     can choose to do fast Apache TinkerPop Gremlin traversals for
     property graph or tuned SPARQL queries over RDF graphs..."

   - TinkerPop:
   @[http://tinkerpop.apache.org/]
     open source, vendor-agnostic, graph computing framework.
     When a data system is TinkerPop-enabled, users are able
     to model their domain as a graph and analyze that graph
     using the Gremlin graph traversal language. 
     ...Sometimes an application is best served by an in-memory, 
     transactional graph database. Sometimes a multi-machine
     distributed graph database will do the job or perhaps the
     application requires both a distributed graph database for
     real-time queries and, in parallel, a Big(Graph)Data processor
     for batch analytics. 

BºGraph Management Software Summaryº
  - Exploitation:
    - Visualization: Gephi, Cytoscape, LinkUrious
    - APIs: Apache TinkerPop, 
  - DDBB:
    - JanusGraph, Neo4j, DataStax, MarkLogic, ...
  - Processing Systems:
    - Spark, Apache Giraph, Oracle Labs PGX,
  - Storage:
    - Hadoop, Neo4j, Apache HBase, Cassandra.
Graph analytics and rendering

BºNetWorkXº
https://networkx.org/documentation/stable/index.html
- Python package for the creation, manipulation, and 
  study of the structure, dynamics, and functions of complex networks.
  - Gallery:
  @[https://networkx.org/documentation/stable/auto_examples/index.html]

BºGraphViz Graph Visualizationº
@[http://www.graphviz.org]
http://www.graphviz.org/gallery/
Do graph DBs scale
https://dzone.com/articles/do-graph-databases-scale
Distributed Graph DB
Query
@[https://www.infoq.com/presentations/graph-query-distributed-execution/]
OWL
https://en.wikipedia.org/wiki/Web_Ontology_Language

https://www.w3.org/wiki/Ontology_editors

The Web Ontology Language (OWL) is a family of knowledge 
representation languages for authoring ontologies. Ontologies are a 
formal way to describe taxonomies and classification networks, 
essentially defining the structure of knowledge for various domains: 
the nouns representing classes of objects and the verbs representing 
relations between the objects. Ontologies resemble class hierarchies 
in object-oriented programming but there are several critical 
differences. Class hierarchies are meant to represent structures used 
in source code that evolve fairly slowly (typically monthly 
revisions) whereas ontologies are meant to represent information on 
the Internet and are expected to be evolving almost constantly. 
Similarly, ontologies are typically far more flexible as they are 
meant to represent information on the Internet coming from all sorts 
of heterogeneous data sources. Class hierarchies on the other hand 
are meant to be fairly static and rely on far less diverse and more 
structured sources of data such as corporate databases.[1]


http://owlgred.lumii.lv/


https://www.cognitum.eu/Semantics/FluentEditor/  
RDBMS
ºRDBMSº
ºFeaturesº
- support E-R data model.
- table schema defined by the
  table name and fixed number
  of attributes and data types.
- A record (=entity) corresponds
  to a row in the table.
- basic operations are defined on
  the tables/relations:
  - CRUD: Create/Read/Update/Delete
  - set ops: union|intersect|difference
  - subset selection defined by filters
  - Projection of subset of table columns
  - JOIN: combination of:
    Cartesian_product+selection+projection
  - TX ACID control
  - user management
- Ops defined in  standard SQL
ºHistoryº
- beginning of 1980s
- Most widely used DBMS

ºMain  Examplesº
- PostgreSQL
- Oracle
- MySQL/MariaDB
  - @[http://myrocks.io/] ← Built on top of key-value RocksDB.
                            optimized for SSD/Memory.
- SQLite / DQLite
- TiDB
  - MySQL compatible
  - RAFT-distributed
  - Rust/go written
  - Features:
    - "infinite" horizontal scalability
    - strong consistency
    - HA
- SQL Server
- DB2
- Hive (https://db-engines.com/en/system/Hive)
  - Home: https://hive.apache.org/
  - RºWARNº: No Foreign keys, NO ACID
  -ºEventual Consistencyº
  - data warehouse software facilitates reading, 
    writing, and managing large datasets residing in 
    distributed storage using SQL.  (Hadoop,...)
  - 2.0+ support Spark as execution engine.
  - Implemented in Java
  - supports analysis of large datasets
    stored in Hadoop's HDFS and
    compatible file systems such as
    Amazon S2 filesystem.
  - SQL-like DML and DDL statements
    Traditional SQL queries implemented in
    MapReduce Java API to  execute SQL apps:
    - necessary SQL abstraction provided to
      integrate SQL-like queries (HiveQL) into
      the underlying Java without the need
      for low-level queries.
  - Hive aids portability of SQL-based apps
    to Hadoop
  - JDBC, ODBC, Thrift

- SQL "summary":
  Beginner:                    Intermediate:AA                                 Advanced:
   1 What and Why of SQL?       5 LIKE, AND/OR and DISTINCT                     9  JOINT ... UNIONS ...
   2 Types of SQL               6 NULLs and Dates                              10  OVER ... PARTITION BY ...
   3 Table, Rows and Colums     7 COUNT, SUM, AVG ... GROUP BY ... HAVING...   11  STRUCT ... UNNEST ..
   4 SELECT ... FROM ...        8 Alias, Sub-queries and WITH ...AS ..         12  Reguar Expresions
       WHERE ... ORDER BY ...
Materialize: Stream 2 PSQL
@[https://github.com/MaterializeInc/materialize]
- Streaming database for real-time applications, accepting input data from
  Kafka, CSV files, ... to query them using SQL.
- ask questions about live data
- Refreshed answers in millisecs.
- Designed to help interactively explore streaming data.
- perform data warehousing analytics against live relational data.
- increase the freshness and reduce the load of dashboards/monitoring tasks.

- provide correct/consistent answers with minimal latency.
  (vs approximate answers/eventual consistency).
- recast SQL92 queries as dataflows.
- Support for a large fraction of PostgreSQL, and are actively working 
  to support builting PostgreSQL functions.
dqlite.io
@[https://dqlite.io/]
Dqlite (“distributed SQLite”) extends SQLite across a cluster of machines,
with
automatic failover and high-availability to keep your application running. It
uses C-Raft, an optimised Raft implementation in C, to gain high-performance
transactional consensus and fault tolerance while preserving SQlite’s
outstanding efficiency and tiny footprint.
SchemaSpy
@[http://schemaspy.org/]
- SchemaSpy is generates database to HTML documentation, including Entity Relationship diagrams.
KEY-VALUE
ºKEY-VALUE STORESº
-ºsimplestºform of DBMS.
- store pairs of keys and values
-ºhigh performanceº
- not adequate for complex apps.
  Low data consistency in distributed clusters,
  when compared to Registry Based.
- value is not ussually "big" (when compared
  to wide-column DDBBs like Cassandra,..). 
  Not designed for high-data-layer apps, but for low-level software tasks.
- extended forms allows to sort the keys, enabling range queries as well as an 
  ordered processing of keys.
- Can evolve to document stores and wide column stores.

ºExamplesº
-ºRedisº: Cache/RAM-Memory oriented. Not strong cluster consistency 
          like etcd / Consul / Zookeeper.
  - Comparative: redis (cache/simple key-value ddbb) 
                 vs
                 etcd  (key-value Registry DDBB):
    - etcd is designed with high availability in mind,
      being distributed by default - with RAFT consensus algorithm -
      and providing consistency through node failures.
      redis doesn't. 
    - etcd is persisted to disk by default. Redis is not.
     (logs and snapshots are possibles) 
    - Redis/memecached is a blazing fast in-memory key-value store 
    - etcd read performance is good for most purposes, but  1-2 orders
      of magnitude lower than redis. Write gap is even bigger.

  - Comparative: redis     (cache/simple key-value ddbb) 
                 vs
                 memcached (cache/simple key-value ddbb) 
    - redis: able to store typed data (lists, hashes, ...)
      and supporting lot of different programming patterns out of
      the box, while memcached doesn't.

-ºAmazon DynamoDBº
-ºMemcachedº

-ºGuava Cacheº
@[https://github.com/google/guava/wiki/CachesExplained]
  - ºnon-distributedº easy-to-use Java library for data caching
  - A Cache is similar to ConcurrentMap, but not quite the same. The most
    fundamental difference is that a ConcurrentMap persists all elements that
    are added to it until they are explicitly removed. A Cache on the other
    hand is generally configured to evict entries automatically, in order to
    constrain its memory footprint. In some cases a LoadingCache can be useful
    even if it doesn't evict entries, due to its automatic cache loading.

-ºHazelcastº
  - Designed for caching.
  - Hazelcast clients, by default, will
    connect to all cache cluster nodes
    an know about the cluster partition
    table to route request to the correct
    node.
  - Java JCache compliant
  - Advanced cache eviction algorithms
    based on heuristics with sanitized O(1)
    runtime behavior.

- LabelDB ("Ethereum State"): Non distributed, with
  focus in local-storage persistence.
- TiKV
  - dev lang:ºRustº
  - Incubation Kubernetes project
  - used also for TiDB
-ºRocksDB:º(by Facebook)
  - built on earlier work on LevelDB
  - core building block for fast key-value server.
  - Focus: storing data on flash drives.
  - It has a Log-Structured-Merge-Database (LSM) design with
    flexible tradeoffs between Write-Amplification-Factor (WAF),
    Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF).
  - It has multi-threaded compactions, making itºespecially suitableº
   ºfor storing multiple terabytes of data in a singleº
   ºlocal (vs distributed) database.º.
   ºIt can be used for big-data processing on single-node infrastructureº
  - Used by Geth/Besu/... and other Ethereum clients ...
RocksDB
@[https://raw.githubusercontent.com/facebook/rocksdb/gh-pages-old/intro.pdf]

Local (embedded in library with no network latency) key-value
database designed for high-performance:
- Prefer in order: RAM → SSD (focus or RocksDB) → HD → Network Storage
- Optimized for server loads (multiple clients reading/writing).
-BºBased on LevelDB, but 10x faster for writesº, thanks to
  new architecture, index algorithm (bloom filters, ...)
- C++ library with up-to-date java wrappers.

RºWhat it is not?º
Rº- Not distributedº
Rº- No failoverº
Rº- Not highly available. data lost if machine breaks.º
    (But highly available architectures can be made on top
    of it adding consensus protocols ... and loosing performance).

- keys and values are byte streams.

- common operations are:
  - Get(key)
  - NewIterator()
  - Put(key, val)
  - Delete(key)
  - SingleDelete(key).

- support for multi-operational transactions, both optimistic and pessimistic mode. 
- RocksDB has a Write Ahead Log (WAL):
  - Puts stored in an in-memory buffer called the memtable as
    well as optionally inserted into WAL. On restart, it re-processes all
    the transactions that were recorded in the log.
- Data Checksuming to detect corruptions for each SST file block.
BºA block, once written to storage, is never modified.º

Search
- "NoSQL" DBMS for search of content
- Optimized for:
  - complex search expressions
  - Full text search
  - reducing words to stem
  - Ranking and grouping of results
  - Geospatial search
  - Distributed search for high
    scalability
ºExamplesº
- Apache Lucene: Powerful search java Library used as the base for
                 powerful solutions.

- Elasticsearch: Both are built on top of Lucene.
  Solr           adding server, cluster and escalability features.
                 - When compared Elasticsearch is simpler to use 
                   and integrates with Kibana graphics.
                 - Solr has better documenttion.
                 
- Splunk        
- MarkLogic
- Sphinx
- Eclipse Hawk:
  @[projects.eclipse.org/projects/modeling.hawk]
  heterogeneous model indexing framework:
  - indexes collections of models
    transparently and incrementally into
    NoSQL DDBB, which can be queried
    efficiently.
  - can mirror EMF, UML or Modelio
    models (among others) into a Neo4j or
    OrientDB graph, that can be queried
    with native languages, or Hawk ones.
    Hawk will watch models and update
    the graph incrementally on change.
Time Series
- Managing time series data
- very high TX load:
  - designed to efficiently collect,
    store and query various time
    series.
- Ex use-case:
  SELECT SENSOR1_CPU_FREQUENCY / SENSOR2_HEAT'
  joins two time series based on the
  overlapping areas of time providing
  new time-serie
- Time/space constrained data can be of two basic types:
  - Point in time/space
  - Region in time/space

ºExamplesº
- Timescale.com
  - PostgreSQL optimized for Time Series.
    by modifying the insert path,
    execution engine, and query planner
    to "intelligently" process queries
    across chunks.
- QuestDB:
  - https://questdb.io/
  - RelationalModel.
  - PostgresSQL wire protocol.(TODO)
  - "Query 1.6B rows in millisecs"
  - Relational Joins + Time-Series Joins
  - Unlimited Transaction Size
  - Cross Platform.
  - Java Embedded.
  - Telegraf TCP/UDP

- InfluxDB
- Kdb+
- Graphite. Simple system that will just:
  - Store numeric time series data
  - Render graphs of this data
  It willºNOTº:
  - collect data (Carbon needed)
  - render graphs (external apps exists)
- RRDtool
- Prometheus
  - https://prometheus.io/
  - See also: Cortex: horizontally scalable, highly available,
                      multi-tenant, long term storage for Prometheus.
  - CNCF project,ºkubernetes friendlyº
  - GoLang based:
    - binaries statically linked
    - easy to deploy.
  - Many client libraries.
  - monitoring metrics analyzer
    and alerting
  - Highly dimensional data model.
    Time series are identified by a
    metric name and a set of
    key-value pairs.
  - stores all data as time series:
    streams of timestamped values
    belonging to same metric and
    same set of labeled dimensions.
  - Multi-mode data visualization.
  - Grafana integration
  - Built-in expression browser.
  - PromQL Query language allowing
    to select+aggregate time-series
    data in real time.
    - result can either be shown as
      graph, tabular data or consumed
      by external HTTP API.
  - Precise alerts based on PromQL.
  - scaling:
    - sharding
    - federation
  - "Singleton" servers relying only
    on local storage.
    (optionally remote storage).

- Uber M3
  www.infoq.com/news/2018/08/uber-metrics-m3
  - Large Scale Metrics Platform
""" built to replace Graphite+Carbon cluster,
    and Nagios for alerting and Grafana for
    dashboarding due to issues like
    poor resiliency/clustering, operational
    cost to expand the Carbon cluster,
    and a lack of replication""".
  ºFeaturesº
  - cluster management, aggregation,
    collection, storage management,
    a distributed TSDB
  - M3QL query language (with features
    not available in PromQL).
  - tagging of metrics.
  - local/remote integration similar to
    the Prometheus Thanos extension providing
    cross-cluster federation, unlimited
    storage and global querying across
    clusters, works.
  - query engine: single global view
    without cross region replication.

- Redis TimeSeries:
  - By default Redis is just a key-value store.
  @[https://redislabs.com/redis-enterprise/redis-time-series/]
    RedisTimeSeries simplifies the use of Redis for time-series use
    cases like IoT, stock prices, and telemetry.
  - See also:
  @[https://www.infoq.com/articles/redis-time-series-grafana-real-time-analytics/]
    How to Use Redis TimeSeries with Grafana for Real-time Analytics
TS Popularity
@[https://www.techrepublic.com/article/why-time-series-databases-are-exploding-in-popularity/]
M3DB
M3DB: (Uber) OOSS distributed timeseries database 
- ability to shard its metrics into partitions,
  replicate them by a factor of three,
  and then evenly disperse the replicas across separate 
  failure domains.
RDF stores
@[https://db-engines.com/en/article/RDF+Stores]
@[https://en.wikipedia.org/wiki/Resource_Description_Framework]
- Resource Description Framework stores
  ºdescribes information in triplets:º
 º(subject,predicate,object)º
- Originally used for describing
  IT-resources-metadata.
- Today often used in semantic web.
- RDS is a subclass of graph DBMS:
  (subject,predicate,object)
   ^          ^      ^
   node      edge   node
  but it offer specific methods
  beyond general graph DBMS ones. Ex:
  SPARQL, SQL-like query lang. for
  RDF data, supported by most
  RDF stores.

ºExamplesº
- MarkLogic
- Jena
- Virtuoso
- Amazon Neptune
- GraphDB
- Apache Rya:
JSON-LD
- format for mapping JSON data into the RDF semantic graph model, as defined by [JSON-LD]. 

@[https://en.wikipedia.org/wiki/JSON-LD
Apache Rya
@[https://searchdatamanagement.techtarget.com/news/252472464/Apache-Rya-matures-open-source-triple-store-database]
The project started at the U.S. government's Laboratory for Telecommunication
Sciences with an initial research paper published in 2012.

Among Rya's users is the U.S. Navy, which is using the open source technology
in a number of efforts, including one for autonomous drones. The project got
started because there was a need for a scalable RDF triple store and no
existing system that met all requirements, said Adina Crainiceanu, vice
president of Apache Rya and associate professor of computer science at the U.S.
Naval Academy.

Service Registry
 ºand Discoveryº

 - Specialized key/value DBMS:
   -ºtwo processes must existsº:
     -ºService registration processº
       storing final-app-service
       (host,port,...)
     -ºService discovery processº
       - let final-app-services query
         the data
   - other aspects to consider:
     - auto-delete of non-available services
     - Support for replicated services
     - Remote API is provided

 ºExamplesº
 -ºZooKeeperº
   - originated from Hadoop ecosystem.
     now is core part of most cluster based Apache
     projects (hadoop, kafka, solr,...)
   - data-format similar to file system.
   - cluster mode.
   - Disadvantages:
     - complex:
     - Java plus big number of dependencies
   - Still used by kafka for config but
     plans exists to replace.
   @[https://issues.apache.org/jira/browse/KAFKA-6598]
   @[https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum] - 2019-11-06 -
 -ºetcdº
   - distributed versioned key/value store
   - HTTPv2/gRPC remote API.
   - based on previous experiences with ZooKeeper.
   - hierarchical config.
   - Very easy to deploy, setup and use
     (implemented in go with self-executing binary)
   - reliable data persistence
   - very good doc
   - @https://coreos.com/blog/introducing-zetcd]
   - See etcd vs Consul vs ZooKeeper vs NewSQL
     comparative at:
   @[https://etcd.io/docs/v3.4.0/learning/why/]
     and:
   @[https://loneidealist.wordpress.com/2017/07/12/apache-zookeeper-vs-etcd3/]
   Disadvantages:
   - needs to be combined with few
     third-party tools for  serv.discover:
     - etcd-registrator: keep updated  list
                       of docker containers
     - etcd-registrator-confd: keep updated
                               config files
     - ...
   - Core-component of Kubernetes for cluster
     configuration.

 -ºConsulº
   - strongly consistent datastore
   - multidatacenter gossip protocol
     for dynamic clusters
   - hierarchical key/value store
   - adds the notion of app-service
     (and app-service-data).
     - "watches" can be used for:
       - sending notifications of
         data changes
       - (HTTP, TTLs , custom)
         health checks and
         output-dependent
         commands

   - embedded service discovery:
     no need to use third-party one
     (like etcd). Discovery includes:
     - node         health checks
     - app-services health checks
     - ...
   - Consul provides a built in framework
     for service discovery.
     (vs etcd basic key/value + custom code)
     - Clients just register services and
       perform discovery using the DNS
       or HTTP interface.
   - out of the box native support for
     multiple datacenters.
   - template-support for config files.
   - Web UI: display all services and nodes,
     monitor health checks, switch
     from one datacenter to another.
 - doozerd (TODO)
 See also:
 - Comparision chart:
 coreos.com/etcd/docs/latest/learning/why.html
etcd 101
"etcd" name refers to unix "/etc" folder and "d"istributed system.

- etcd is designed as a general substrate for large scale distributed systems. 
  These are systems that will never tolerate split-brain operation and are 
  willing to sacrifice availability to achieve this end. etcd stores metadata in 
  a consistent and fault-tolerant way. An etcd cluster is meant to provide 
  key-value storage with best of class stability, reliability, scalability and 
  performance.

@[https://github.com/etcd-io/etcd]
@[https://etcd.io/docs/v3.4.0/]

@[https://etcd.io/docs/v3.4.0/learning/] 
- designed to:
  reliably (99.999999%) store infrequently updated data 
  reliable watch queries.
- multiversion persistent key-value store of key-value pairs:
  - inexpensive snapshots 
  - watch history events ("time travel queries").
- key-value store is effectively immutable unless 
  store is compacted to to shed oldest versions.

Logical view
- flat binary key space.
- key space: 
  - lexically sorted index on byte string keys
    → soºrange queries are inexpensiveº.
  - Each key/key-space maintains multiple revisions starting with version 1
- revisions are indexed as well
  →ºranging over revisions with watchers is efficientº

- generation: key life-span, from creation to deletion.
  1 key ←→ 1+ generation.

Physical view:
- data stored as key-value pairs in a persistent b+tree.
-ºkey of key-value pair is a 3-tuple (revision, sub,     type)º.
                                                ^        ^
                                                Unique   (opt)
                                                key-ID   special key-type
                                                in rev.  (deleted key,..)




@[https://jepsen.io/analyses/etcd-3.4.3]
- In our tests, etcd 3.4.3 lived up to its claims for key-value operations: we 
  observed nothing but strict-serializable consistency for reads, writes, and 
  even multi-key transactions, during process pauses, crashes, clock skew, 
  network partitions, and membership changes. Strict-serializable behavior was 
  the default for key-value operations; performing reads with the serializable 
  flag allowed stale reads, as documented.

@[https://etcd.io/docs/v3.4.0/learning/design-client/]
- etcd v3+ is fully committed to (HTTPv2)gRPC.
  For example: v3 auth has connection based authentication, 
  rather than v2's slower per-request authentication.

Wide Column Stores
(also called extensible record stores)
-ºshort of two-dimensionalº ºkey-value stores.º
  where key set as well as value can be "huge"
  (thousands of columns per value)

  - Compared to Service Registry DDBBs (Zookeeper,
    etcd3) that are also key-value stored :
    - Ser.Registry ddbb scale much less (many
      times a single node can suffice) but are
      strongly consistent. Wide Column Stores
      can scale to ten of thousands of nodes but
      consistency is eventual / optimistic.
  - column names and record keys ºare not fixedº
                                  └─────┬─────┘
                                  (schema-free)
- not to be confused with column oriented storage
  of RDMS. Last one is an internal concept
  for improving performance storing
  table-data column-by-column vs
  record-by-record)
ºExamplesº
-ºCassandraº
  - SQL-like SELECT, DML and DDL statements (CQL)
  - handle large amounts of data across farm
    of commodity servers, providing high availability
    with no single point of failure. (peer-to-peer)
  - Allows for different replication strategies
    (sync/async for slow-secure/fast-less-secure)
    cluster replication.
  - logical infrastructure support for clusters
    spanning multiple datacenters in different 
    continents.
  -ºIntegrated Managed Data    º
   ºLayer Solutions with Spark,º
   ºKafka and Elasticsearch.   º

- PostgreSQL + cstore_fdw extension: columnar store extension for analytics
  use cases where data is loaded in batches.
  Cstore_fdw’s columnar nature delivers performance by only reading relevant
  data from disk. It may compress data by 6 to 10 times
  to reduce space requirements for data archive.


-ºScyllaº
  - compatible with Cassandra (same CQL and Thrift protocols,
     and same SSTable file formats)
    with  higher throughputs and lower latencies (10x).
  - C++14 (vs Cassandra Java)
  -ºcustom network managementºthat minimizes resource by
   ºbypassing the Linux kernel. No system call is requiredº
    to complete a network request. Everything happens in userspace.
  -ºSeastar async lib replacing threadsº
  - sharded design by node:
    - each CPU core handles a data-subset
    - authors claim to achieve much better
      performance on modern NUMA SMP archs,
      and to scale very well with the number
      of cores:
      -ºUp to 2 million req/sec per machineº
-ºHBaseº. Apache alternative to Google BigTable
  Internally uses skip-lists:
@[../programming_theory.html?query=e65f9917-5f27-4b78-8a5a-0653640b6b88]
_ºCosmosDBº
REF: Google bigTable-osdi06.pdf
-ºGoogle Spannerº
Cassandra 101
RºWARN:º
- Why It's a Poor Choice For a Metadata Database for Object Stores
@[https://blog.min.io/the-trouble-with-cassandra-based-object-stores/]
  - Cassandra excels at supporting write-heavy workloads,
  - Cassandra have limitations when supporting read-heavy workloads
    due to its eventual consistency model and lack of transactions,
    multi-table support like joins, subqueries can also limit its usefulness.
  - Object storage needs are far simpler and different from what Cassandra
    is built for.
  - RºWARNº: Because the implications of employing Cassandra as a object
    storage metadata database were not properly understood, many object
    storage vendors made it a foundational part of their architecture. 
    keeping them from ever moving past simple archival workloads.





Cassandra whitepaper:
-ºdistributed key-value storeº
- developed at Facebook.
- data and processing spread out across many commodity servers
- highly available service without single point of failure
  allowing replication even across multiple data centers.
- Possibility to choose synchronous/asynchronous replication
  for each update.
-ºfully distributed DB, with no master DBº
- Reads are linearly scalable with the number of nodes.
  It scalates (much)better cassandra that comparables systems
  like hbase,voldermort voldtdb redis mysql
  Writes are linearyly scalable?

- based on 2 core technologies:
  - Google's Big Table
  - Amazon's Dynamo

- 2 versions of Cassandra:
  - Community Edition  : distributed under the Apache™ License
  - Enterprise Edition : distributed by Datastax

- Cassandra is a BASE system (vs an ACID system):
  BASE == (B)asically (A)vailable, (S)oft state, (E)ventually consistent)
  ACID == (A)tomicity, (C)onsistency, (I)solation, (D)urability.

Oº- BASE implies that the system is optimistic and accepts that   º
Oº  the database consistency will be in a state of fluxº
Oº- ACID is pessimistic and it forces consistency at the endº
Oº  of every transaction.º


               Replication
               Strategy
                   1
                   ↑
                   ↓
                   1
cluster 1 ←→ 1+ Keyspace 1 ←→ N Column Family   ←→ 0+ Row
             └┬┘                └─────┬─────┘
         Typically
         just one               sharing similar   - collection of
                                structure           sorted columns.

  ┌─ CQL lang. used to access data.
┌─┴───────────────────────────────────────────────────────────────┐
│OºKEYSPACEº: container for app.data, similar to a RDMS schema    │
│             ┌─────────────────────────────────────────────────┐ │
│             │ Column Family 1                                 │ │
│             └─────────────────────────────────────────────────┘ │
│             ┌─────────────────────────────────────────────────┐ │
│             │ Column Family 2                                 │ │
│             └─────────────────────────────────────────────────┘ │
│               ...                                               │
│ ┌─────────┐ ┌─────────────────────────────────────────────────┐ │
│ │Settings │ │ Column Family N      Col    Col2  ...  ColN     │ │
│ ├─────────┤ │                      Key1   Key2  ...  KeyN     │ │
│ │*Replica.│ │ ┌────────┐ ┌──────────────────────────────────┐ │ │
│ │ Strategy│ │ │Settings│ │RowKey1│Val1_1│Val2_1│   │ValN_1│ │ │ │
│ └─────────┘ │ │        │ │RowKey2│Val2_2│Val2_2│   │ValN_2│ │ │ │
│             │ └────────┘ │         ...                      │ │ │
│             │            └──────────────────────────────────┘ │ │
│             └─────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
                            RowKey := Unique row UID
                                      Used also to sharde/balance
                                      load acrossºnodesº
                            ValX_Y := value ||
                                      value collection


     BASIC INFRASTRUCTURE ARCHITECTURE:
     ──────────────────────────────────
ºNodeº: (Cassandra peer instance)  ºRackº:
 - data storage (on File System     - Set of nodes
   or HDFS)
 - The data is balanced
   across nodes based
   on the RowKey f each
   Column Family.
 - Commit log to ensure
   data durability.
   - Sequentially written                    automatically
                                             replicated throughout
             ┌── Node "N" choosen            cluster
             │   based on RowKey                 ↑
             ↓                                ┌──┴───┐
Input → Commit Log → Index+write →(memTable → Write to    ──→ SSTable
        @Node N      toºmemTableº  full)     ºSSTableº        Compaction
                        └──┬───┘              └──┬───┘        └───┬────┘
           - in-memory structure   - (S)orted (S)tring Table  - Periodically consolidates
           - "Sort-of" write-back    in File System or HDFS     storage by discarding
             cache                                              data marked for deletion
                                                                with a tombstone.
                                                              - repair mechanisms exits
                                                                to ensure consistency
                                                                of cluster

ºDatacenterº:                 ºClusterº:
- Logical Set of Racks.       - 1+ Datacenters
- Ex: America, Europe,...     - full set of nodes
- Replication set at            which map to a single
  Data Center bounds            complete token ring

CQL
- similar syntax to SQL
- works with table data.
- Available Shells: cqlsh | DevCenter | JDBC/ODBC drivers

ºCoordinator Nodeº:
- Node where client connects for a read/write request
  acting as a proxy between the client and Cassandra
  internals.

Cassandra Key Components
 Gossip
    A peer-to-peer communication protocol in which nodes periodically exchange
state information about themselves and about other nodes they know about. This
is similar to hear-beat mechanism in HDFS to get the status of each node by the
master.

    A peer-to-peer communication protocol to share location and state
information about the other nodes in a Cassandra cluster. Gossip information is
also stored locally by each node to use immediately when a node restarts.

    The gossip process runs every second and exchanges state messages with up
to three other nodes in the cluster. The nodes exchange information about
themselves and about the other nodes that they have gossiped about, so all
nodes quickly learn about all other nodes in the cluster.

    A gossip message has a version associated with it, so that during a gossip
exchange, older information is overwritten with the most current state for a
particular node.

    To prevent problems in gossip communications, use the same list of seed
nodes for all nodes in a cluster. This is most critical the first time a node
starts up. By default, a node remembers other nodes it has gossiped with
between subsequent restarts. The seed node designation has no purpose other
than bootstrapping the gossip process for new nodes joining the cluster. Seed
nodes are not a single point of failure, nor do they have any other special
purpose in cluster operations beyond the bootstrapping of nodes.



Note: In multiple data-center clusters, the seed list should include at least
one node from each data center (replication group). More than a single seed
node per data center is recommended for fault tolerance. Otherwise, gossip has
to communicate with another data center when bootstrapping a node. Making every
node a seed node is not recommended because of increased maintenance and
reduced gossip performance. Gossip optimization is not critical, but it is
recommended to use a small seed list (approximately three nodes per data
center).
Data distribution and replication

How data is distributed and factors influencing replication.

In Cassandra, Data is organized by table and identified by a primary key, which
determines which node the data is stored on. Replicas are copies of rows.

Factors influencing replication include:
Virtual nodes (Vnodes):

Assigns data ownerommended for production. It defines a node’s data center
and rack and uses gossip for propagating this information to other nodes.

There are many vtypes of snitches like dynamic snitching, simple snitching,
RackInferringSnitch, PropertyFileSnitch, GossipingPropertyFileSnitch,
Ec2Snitch, Ec2MultiRegionSnitch, GoogleCloudSnitch, CloudstackSnitch.

TODO:
https://rene-ace.com/cassandra-101-understanding-what-is-cassandra/
- Seed node
- Snitch purpose
- topologies
- Coordinator node,
- replication factors,
- ...
Cassandra+Spark
@[https://es.slideshare.net/chbatey/1-dundee-cassandra-101]

Similar projects
- Voldermort: developed by Linked-In.
@[https://www.project-voldemort.com/voldemort/]

BºWhat's newº
- Cassandra 4.0 (2020-10)
@[https://cassandra.apache.org/doc/latest/new/]
  - Support for Java 11
  - Virtual Tables
  - Audit Logging
    - All successful/failed login attempts
    - All database command requests to CQL.
      - high performant live query logging 
      - useful for live traffic capture and traffic replay.
      - Chronicle-Queue used to rotate a log of queries
  - Full Query Logging (FQL)
  - Improved Internode Messaging
  - Improved Streaming
    - Streaming is the process used by nodes of a cluster to exchange data
      in the form of SSTables.
  - Transient Replication (experimental)


cstart Cassandra Orchestration Tool by Spotify @[https://github.com/spotify/cstar] @[www.infoq.com/news/2018/10/spotify-cstar] """ ... Cstar emerged from the necessity of running shell commands on all host in a Cassandra cluster .... Spotify fleet reached 3000 nodes. Example scripts were: - Clear all snapshots - Take a new snapshot (to allow a rollback) - Disable automated puppet runs - Stop the Cassandra process - Run puppet from a custom branch of our git repo in order to upgrade the package - Start the Cassandra process again - Update system.schema_columnfamilies to the JSON format - Run `nodetool upgradesstables`, which depending on the amount of data on the node, could take hours to complete - Remove the rollback snapshot """ $ pip3 install cstart REF: @[https://github.com/spotify/cstar] """Why not simply use Ansible or Fabric? Ansible does not have the primitives required to run things in a topology aware fashion. One could split the C* cluster into groups that can be safely executed in parallel and run one group at a time. But unless the job takes almost exactly the same amount of time to run on every host, such a solution would run with a significantly lower rate of parallelism, not to mention it would be kludgy enough to be unpleasant to work with. Unfortunately, Fabric is not thread safe, so the same type of limitations apply. All involved machines are assumed to be some sort of UNIX-like system like OS X or Linux with python3, and Bourne style shell."""
Dynamo vs Cassandra
https://sujithjay.com/data-systems/dynamo-cassandra/
Dynamo vs Cassandra : Systems Design of NoSQL Databases

State-of-the-art distributed databases represent a distillation of 
years of research in distributed systems. The concepts underlying any 
distributed system can thus be overwhelming to comprehend. This is 
truer when you are dealing with databases without the strong 
consistency guarantee. Databases without strong consistency 
guarantees come in a range of flavours; but they are bunched under a 
category called NoSQL databases.
Google Spanner
(BigTable subtitution)
https://cloud.google.com/spanner/
- Features:
  - Schema
  - SQL
  - Consistency
  - Availability
  - Scalability
  - Replication
Document Stores
- also called document-oriented DBMS
- schema-free:
  - different records may have
    different columns
  - values of individual columns
    can have dif. types
- Columns can be multi-value
- Records can have a nested structure
- Often use internal notations,
  mostly JSON.
- features:
  - secondary indexes in (JSON) objects

ºExamplesº
- MongoDB @[#mongo_summary]
  www.infoq.com/articles/Starting-With-MongoDB
  14 Things I Wish I’d Known When Starting with
  MongoDB
- Amazon DynamoDB
- Couchbase
- Cosmos DB
- CouchDB:
  - Note: https://pouchdb.com/
    "...PouchDB is an open-source JavaScript database inspired by 
      Apache CouchDB that is designed to run well within the browser.
      PouchDB was created to help web developers build applications that 
      work as well offline as they do online.
      It enables applications to store data locally while offline, then 
      synchronize it with CouchDB and compatible servers when the 
      application is back online, keeping the user's data in sync no matter 
      where they next login..."

Note from https://www.xplenty.com/blog/couchdb-vs-mongodb/
  - BºCAP theorem: CouchDB prioritizes availability, while MongoDB º
    Bºprioritizes consistency.º
  - Reviews: MongoDB seems to have somewhat better reviews than CouchDB.
MongoDB Summary
$ mongo                                  ← Launch mongo shell

# show dbs                               ← list all databases

# use db_name                            ← create/ login to a database

# db.createCollection('collection01')    ← create collection

# db.collection01.insert( [              ← Insert N documents
    { key1: "val11", key2:"val21", },      db.collection.insertOne()
    { key1: "val12", key2:"val22", },      db.collection.insertMany()
    { key1: "val13", key2:"val23", }
  ]);


# db.collection01.save(                  ← Upsert doc.
    { key1: "newVal", ... },
  );

# db.collection01.update()               ← Update 1+ docs. in collection based on
                                           matching document and based on multi option

# db.collection01.update(                ← updateOne()|updateMany()
     { key1: val11 },                    ← query to match
     { $set: { key2 : val2} },           ← update with 
     { multi: true}                      ← options
  );

# db.collection01                        ← Update single document.
  .findOneAndUpdate(
    filter, 
    update,
    options)                             ← Options:
                                           upsert=true|false
                                           returnNewDocument: true: return new doc
                                           (vs original).
                                           upsert==true ⅋⅋ returnNewDocument==false:
                                           → return null.

# db.collecti01.findOne( { _id  : 123 });← Find by ID
# db.collecti01.findOne( { key1 : val1});       by query 
# db.collecti01.findOne(                 ← Find with projection 
     { key1 : val1},                       (limiting the fields to return)
     { key2: 1}      );                    ←  returns id , key2 fields only


# db.collecti01.find( {...} )            ← Returns cursor with selected docs.

# db.collecti01.deleteOne( filter, opts) ← or deleteMany(filter, opts)


BºreadConcern levels:º  
- Control consistency and isolation properties of data reads 
  ( from replica sets ⅋ replica set shards ).

  - local: returns data from instace without guaranteing that
           its been written to majority of replica members
           (Default for reads against primary)
  - available: similar to local, gives lowest latancy for sharded collections.
           (Default for reads against secondaries)
  - majority: returns only if the data acknowledged by a majority of 
           the replica set members.
  -  linearizable: return data after all successful majority-acknowledged 
           writes
  - snapshot: Only available for transactions on multi documents.

BºWrite Concern:º
  - Control level of acknowledgment for a given write operation
    sent to mongod and mongos (for sharded collections)

  - fields spec:
    - w       : (number),
      0 : No ACK requested
      1 : (default) wait ACK from standalone mongod|primary(replica set)
      2+:           wait ACK from primary + given N-1 secondaries
    - j : (bool)    
      true =˃ return ACK only after real writte onto on-disk journal
    - wtimeout: millisecs before returning error.

RºWARNº: On failure MongoDB does not rollback the data, 
         data may eventually get stored

BºIndexesº
  _id Index (Default index, can NOT be dropped)
  1. Single Field:
  2. Multikey Field:
  3. Geospatial Index:
  4. Text Indexes:
  9. Aggregations
  1. Aggregation Pipeline
  2. Map-Reduce
  3. Single Purpose
Data(log) collect
ºFluentDº                                     |ºLogstashº
 "Improved" logstat                           |- TheºLº in ELK
 - https://www.fluentd.org/                   |-*OS data Collector
 - data collector for unified logging layer   |- TODO:
 - increasingly used Docker, GCP,             |  osquery.readthedocs.io/en/stable/
   and Elasticsearch communities              |- low-level instrumentation framework
 - https://logz.io/blog/fluentd-logstash      |  with system analytics and monitoring
   FluentD vs Logstash compared               |  both performant and intuitive.
ºFeaturesº                                    |- osquery exposes an operating system
 - unify data collection and consumption      |  as a high-performance SQL RDMS:
   for better understanding of data.          |  - SQL tables represent concepts such
                                              |    as running processes, loaded kernel
                                              |    modules, open network connections,
Syslog                      Elasticsearch     |    browser plugins, hardware events
Apache/Nginx logs    → → →  MongoDB           |ºFeatures:º
Mobile/Web app logs  → → →  Hadoop            |  - File Integrity Monitoring (FIM):
Sensors/IoT                 AWS, GCP, ...     |  - DNS
                                              |  - *1

*1:@[https://medium.com/palantir/osquery-across-the-enterprise-3c3c9d13ec55]
  @[https://github.com/tldr-pages/tldr/blob/master/pages/common/logstash.md]

|ºPrometheus Node ExporteRº                     |ºOthersº
|- https://github.com/prometheus/node_exporter  |- collectd
|- TODO:                                        |- Dynatrace OneAgent
                                                |- Datadog agent
                                                |- New Relic agent
                                                |- Ganglia gmond
                                                |- ...
Loki
@[https://grafana.com/loki]
- logging backend, optimized for Prometheus and Kubernetes
- optimized to search, visualize and explore your logs natively in Grafana.
Logreduce
IA filter
Logreduce IA filter
  - logreduce@pypi
  - Quiet log noise with Python and machine learning
Streams
Distributed (Event) Stream Processors
REF:
- Stream_processing@Wikipedia
- Patterns for streaming realtime anaylitics
- ???
- Streaming Reactive Systems & Data Pites w. Squbs
- Streaming SQL Foundations: Why I Love Streams+Tables 
- Next Steps in Stateful Streaming with Apache Flink
- Kafka Streams - from the Ground Up to the Cloud 
- Data Decisions with Real-Time Stream Processing
- Foundations of streamng SQL
- The Power of Distributed Snapshots in Apache Flink
- Panel: SQL over Streams, Ask the Experts
- Survival of the Fittest - Streaming Architectures
- Streaming for Personalization Datasets at Netflix
- < href="https://www.safaribooksonline.com/library/view/an-introduction-to/9781491934951/">An Introduction to Time Series with Team Apache 
""" Apache Cassandra evangelist Patrick McFadin shows how to solve time-series data
    problems with technologies from Team Apache: Kafka, Spark and Cassandra.
      - Kafka: handle real-time data feeds with this "message broker"
      - Spark: parallel processing framework that can quickly and efficiently
               analyze massive amounts of data
      - Spark Streaming: perform effective stream analysis by ingesting data
               in micro-batches
      - Cassandra: distributed database where scaling and uptime are critical
      - Cassandra Query Language (CQL): navigate create/update your data and data-models
      - Spark+Cassandra: perform expressive analytics over large volumes of data

|ºKafkaº@[./kafka_map.html]                             | ºSparkº (TODO)
|-@[https://kafka.apache.org]                           | @[http://spark.apache.org/]
|- scalable,persistent and fault-tolerant               | - Zeppelin "Spark Notebook"(video):
|  real-time log/event processing cluster.              | @[https://www.youtube.com/watch?v=CfhYFqNyjGc]
|- It's NOT a real stream processor  but a              |
|  broker/message-bus to store stream data              |
|  for comsuption.                                      |
|- "Kafka stream" can be use to add                     |
|  data-stream-processor capabilities                   |
|- Main use cases:                                      |
|  - real-time reliable streaming data                  | ºFlinkº TODO
|    pipelines between applications                     | - Includes a powerful windowing system
|  - real-time streaming applications                   |   supports many types of windows:
|    that transform or react to the                     |   - Stream windows and win.aggregations are crucial
|    streams of data                                    |     building block for analyzing data streams.
|- Each node-broker in the cluster has an               |
|  identity which can be used to find other             |
|  brokers in the cluster. The brokers also             |
|  need some type of a database to store                |
|  partition logs.                                      |
|-@[http://kafka.apache.org/uses]                       |
|  (Popular) Use cases                                  |
|  BºMessagingº: good replacement for traditional       |
|    message brokers decoupling processing from         |
|    data producers, buffering unprocessed messages,    |
|    ...) with better throughput, built-in              |
|    partitioning, replication, and fault-tolerance     |
|  BºWebsite Activity Trackingº(original use case)      |
|    Page views, searches, ... are published            |
|    to central topics with one topic per activity      |
|    type.                                              |
|  BºMetricsº operational monitoring data.              |
|  BºLog Aggregationº replacement.                      |
|    - Group logs from different servers in a central   |
|      place.                                           |
|    - Kafka abstracts away the details of files and    |
|      gives a cleaner abstraction of logs/events as    |
|      a stream of messages allowing for lower-latency  |
|      processing and easier support for multiple data  |
|      sources and distributed data consumption.        |
|      When compared to log-centric systems like        |
|      Scribe or Flume, Kafka offers equally good       |
|      performance, stronger durability guarantees due  |
|      to replication, and much lower end-to-end        |
|      latency.                                         |
|  BºStream Processingº: processing pipelines           |
|    consisting of multiple stages, where raw input     |
|    data is consumed from Kafka topics and then        |
|    aggregated/enriched/transformed into new           |
|    topics for further consumption.                    |
|    - Kafka 0.10.+ includes Kafka-Streams to           |
|      easify this pipeline process.                    |
|    - alternative open source stream processing        |
|      tools include Apache Storm and Samza.            |
|  BºEvent Sourcing architectureº: state changes        |
|    are logged as a time-ordered sequence of           |
|    records.                                           |
|  BºExternal Commit Log for distributed systemsº:      |
|    - The log helps replicate data between             |
|      nodes and acts as a re-syncing mechanism         |
|      for failed nodes to restore their data.          |

See also:
-Flink vs Spark Storm:
@[https://www.confluent.io/blog/apache-flink-apache-kafka-streams-comparison-guideline-users/]
Check also: Streaming Ledger. Stream ACID TXs on top of Flink

@[https://docs.confluent.io/current/schema-registry/index.html]
Confluent Schema Registry provides a serving layer for your metadata. It
provides a RESTful interface for storing and retrieving Apache Avro® schemas.
It stores a versioned history of all schemas based on a specified subject name
strategy, provides multiple compatibility settings and allows evolution of
schemas according to the configured compatibility settings and expanded Avro
support. It provides serializers that plug into Apache Kafka® clients that
handle schema storage and retrieval for Kafka messages that are sent in the
Avro format.


- LodDevice(Facebook)
@[https://www.infoq.com/news/2018/09/logdevice-distributed-logstorage]
  - LogDevice has been compared with other log storage systems
    like Apache BookKeeper and Apache Kafka.
  - The primary difference with Kafka
  (@[https://news.ycombinator.com/item?id=17975328]
    seems to be the decoupling of computation and storage
  - Underlying storage based on RocksDB, a key value store
    also open sourced by Facebook
Event Based
Architecture
@[https://www.infoq.com/news/2017/11/jonas-reactive-summit-keynote]

Jonas Boner ... talked about event driven services (EDA) and
event stream processing (ESP)... on distributed systems.

... background on EDA evolution over time:
- Tuxedo, Terracotta and Staged Event Driven Architecture (SEDA).

ºevents represent factsº
- Events drive autonomy in the system and help to reduce risk.
- increase loose coupling, scalability, resilience, and traceability.
- ... basically inverts the control flow in the system
- ... focus on the behavior of systems as opposed to the
  structure of systems.

- TIP for developers:
 ºDo not focus on just the "things" in the systemº
 º(Domain Objects), but rather focus on what happens (Events)º
- Promise Theory:
  - proposed by Mark Burgess
  - use events to define the Bounded Context through
    the lense of promises.

quoting Greg Young:
"""Modeling events forces you to have a temporal focus on what’s going on in the
system. Time becomes a crucial factor of the system."""
""" Event Logging allows us to model time by treating event as a snapshot in time
   and event log as our full history. It also allows for time travel in the
   sense that we can replay the log for historic debugging as well as for
auditing and traceability. We can replay it on system failures and for data replication."""

Boner discussed the following patterns for event driven architecture:
- Event Loop
- Event Stream
- Event Sourcing
- CQRS for temporal decoupling
- Event Stream Processing

Event stream processing technologies like Apache Flink, Spark Streaming,
Kafka Streams, Apache Gearpump and Apache Beam can be used to implement these
design patterns.

DDDesign, CQRS ⅋ Event Sourcing
Domain Driven Design, CQRS and Event Sourcing
@[https://www.kenneth-truyers.net/2013/12/05/introduction-to-domain-driven-design-cqrs-and-event-sourcing/]


Enterprise Patterns
ESB Architecture
- Can be defined by next feautes:
  - Monitoring of services/messages passed between them
  - wire Protocol bridge between HTTP, AMQP, SOAP, gRPC, CVS in Filesystem,...
  - Scheduling, mapping, QoS management, error handling, ..
  - Data transformation
  - Data pipelines
  - Mule, JBoss Fuse (Camel + "etc..."), BizTalk, Apache ServiceMix, ...
REF: https://en.wikipedia.org/wiki/Enterprise_service_bus#/media/File:ESB_Component_Hive.png
    ^   Special App. Services
    |
E   |   Process Automation                 BPEL, Workflow
n   |
t   |   Application Adapters               RFC, BABI, IDoc, XML-RPC, ...
e m |
r e |   Application Data Consolidation     MDM, OSCo, ...
p s |
r s |   Application Data Mapping           EDI, B2B
i a |   _______________________________
s g |   Business Application Monitoring
e e |   _______________________________
    |   Traffic Monitoring Cockpit
S c |
e h |   Special Message Services           Ex. Test Tools
r a |
v n |   Web Services                       WSDL, REST, CGI
i n |
c e |   Protocol Conversion                XML, XSL, DCOM, CORBA
e l |
    |   Message Consolidation              N.N (data locks, multi-submit,...)
B   |
u   |   Message Routing                    XI, WBI, BIZTALK, Seeburger
s   |
    |   Message Service                    MQ Series, MSMQ, ...
Apache NiFi: Data Routing
@[https://nifi.apache.org]
- no-code, .drag-and-drop Web-GUI to build pipelines for
  data route+transform (99% NiFi users will never see a line of code)
- Deployed as a standalone application  (vs framework, api or library)
- Support batch ETL but "prefers" data streams.
- Support for binary data and large files ("multi-GB" video files).
- Complex (and performant) transformations, enrichments and normalisations. 
- out of the box it supports:
  - Kafka, Elastic, HDFS, S3, Postgres, Mongo, etc.
  - Generic sources/endpoints: TCP, HTTP, IMAP, ...

- scalable directed graphs of data routing/transformation/system mediation logic.
- UI to control design, feedback, monitoring
- natively clustered - it expects (but isn't required) to be deployed on
  multiple hosts that work together as a cluster for 
  performance, availability and redundancy.

- Highly configurable:
  - Loss tolerant vs guaranteed delivery
  - Low latency vs high throughput
  - Dynamic prioritization
  - Flow can be modified at runtime
  - Back pressure
- Data Provenance
  - Track dataflow from beginning to end
- Designed for extension
  - Build your own processors and more
  - Enables rapid development and
    effective testing
- Secure
  - SSL, SSH, HTTPS, encrypted content, etc...
  - Multi-tenant authorization and internal
    authorization/policy management

- DevOps:
@[https://dzone.com/articles/setting-apache-nifi-on-docker-containers]
REFS:
- NiFi+Spark:
@[https://blogs.apache.org/nifi/entry/stream_processing_nifi_and_spark"]

- Nifi "vs" Cammel: [comparative]
@[https://stackoverflow.com/questions/65625166/apache-camel-vs-apache-nifi]
Business Data charts
@[https://www.forbes.com/sites/bernardmarr/2017/07/20/the-7-best-data-visualization-tools-in-2017/#643a48726c30]

Apache Superset: [TODO]
@[https://www.phoronix.com/scan.php?page=news_item&px=Apache-Superset-Top-Level]
Superset is the project's big data visualization and business 
intelligence web solution. Apache Superset allows for big data 
exploration and visualization with data from a variety of databases 
ranging from SQLite and MySQL to Amazon Redshift, Google BigQuery, 
Snowflake, Oracle Database, IBM DB2, and a variety of other 
compatible data sources. 
Python based.

Superset is also cloud-native in the sense that it is flexible and 
lets you choose the:
- web server (Gunicorn, Nginx, Apache),
- metadata database engine (MySQL, Postgres, MariaDB, etc),
- message queue (Redis, RabbitMQ, SQS, etc),
- results backend (S3, Redis, Memcached, etc),
- caching layer (Memcached, Redis, etc),

Superset also works well with services like NewRelic, StatsD and 
DataDog.

Low-Level Data charts
|ºKibanaº (TODO)
|- "A Picture's Worth a Thousand Log Lines"
|- visualize (Elasticsearch) data and navigate
|  the Elastic Stack, learning  understanding
|  the impact rain might have on your quarterly
|  numbers

ºGrafanaº (TODO)
@[https://grafana.com/]
- time series analytics
- Integrates with Prometheus queries
@[https://logz.io/blog/grafana-vs-kibana/]
- 7.0+ includes new plugin architecture, visualizations,
  transformations, native trace support, ... 
https://grafana.com/blog/2020/05/18/grafana-v7.0-released-new-plugin-architecture-visualizations-transformations-native-trace-support-and-more/ 
TOGAF
@[https://en.wikipedia.org/wiki/The_Open_Group_Architecture_Framework]
The Open Group Architecture Framework (TOGAF) is a framework for enterprise
architecture that provides an approach for designing, planning, implementing,
and governing an enterprise information technology architecture.[2] TOGAF is
a high level approach to design. It is typically modeled at four levels:
Business, Application, Data, and Technology. It relies heavily on
modularization, standardization, and already existing, proven technologies and
products.

TOGAF was developed starting 1995 by The Open Group, based on DoD's TAFIM. As
of 2016, The Open Group claims that TOGAF is employed by 80% of Global 50
companies and 60% of Fortune 500 companies.[3]
Kogito
@[http://kie.org/]
- Kogito == Drools + jBPM + OptaPlanner
BºDroolsº
  - Drools is a business rule management system with a 
    forward-chaining and backward-chaining inference based rules engine, 
    allowing fast and reliable evaluation of business rules and complex 
    event processing. 
    Also use in other projects like https://www.shopizer.com/, and OOSS
    java ecommerce solution based on Spring/Hibernate/Drools/Elastic.

BºjBPMº
@[https://www.jbpm.org/]
  - toolkit for building business applications to help automate business processes and decisions.
  - jBPM originates from BPM (Business Process Management) but it has 
    evolved to enable users to pick their own path in business 
    automation. It provides various capabilities that simplify and 
    externalize business logic into reusable assets such as cases, 
    processes, decision tables and more.
    - business processes (BPMN2)
    - case management (BPMN2 and CMMN)
    - decision management (DMN)
    - business rules (DRL)
    - business optimisation (Solver)
  
  - jBPM can be used as standalone service or embedded in custom 
    service. It does not mandate any of the frameworks to be used, it can 
    be successfully used in
    - traditional JEE applications - war/ear deployments
    - SpringBoot, Thorntail, ... deployments
    - standalone java programs

BºOptaPlannerº [optimization]
- OptaPlanner is an AIºconstraint solverº. It optimizes planning and 
  scheduling problems, such as the Vehicle Routing Problem, Employee 
  Rostering, Maintenance Scheduling, Task Assignment, School 
  Timetabling, Cloud Optimization, Conference Scheduling, Job Shop 
  Scheduling, Bin Packing and many more. Every organization faces such 
  challenges: assign a limited set of constrained resources (employees, 
  assets, time and/or money) to provide products or services. 
  OptaPlanner delivers more efficient plans, which reduce costs and 
  improve service quality.
Messaging
Summary
- Messaging represents a layer down ESB. Messaging architectures
  focus on reliable delivery of messages of different nature/timelife.
  On top of the ESB add data management/processing/transformation.

- See also:
  “The many meanings of event-driven architecture,”
@[https://www.youtube.com/watch?v=STKCRSUsyP0]
  (By Martin Fowler)

- Messages classification by message validity:
 ┌────────────────────────────────────────────────┐
 │MESSAGE VALID           │ MESSAGIN SYSTEM NEEDS │
 │────────────────────────│───────────────────────│
 │for a short time        │ fast delivery         │
 │────────────────────────│───────────────────────│
 │until consumed.         │ persistence           │
 │────────────────────────│───────────────────────│
 │for repeated consumption│ persistence and index │
 └────────────────────────────────────────────────┘

- Message classification by semmantics:
  ┌─────────────────────────────────────────────┐
  │NATURE          │                            │
  │Command/Order   │·leads state change         │
  │                │·Requires response/result   │
  │────────────────│────────────────────────────│
  │Query           │·Requires response/result   │
  │                │ sync/async(fire and forget)│
  │────────────────│────────────────────────────│
  │Event           │·Doesn't require response   │
  │                │                            │
  └─────────────────────────────────────────────┘

   Ex messaging system:
   SYSTEM        BEST FOR     SEMANTICS  WORST FOR
  ┌─────────────────────────────────────────────────────┐
  │MQ "Classic   ·Durable     ·Command   ·Replayable    │
  │    Broker"   ·Volatile    ·event                    │
  │─────────────────────────────────────────────────────│
  │KAFKA         ·Replayable  ·Event     ·Volatile      │
  │              ·Scalability            ·light hardware│
  │─────────────────────────────────────────────────────│
  │Qpid          ·Volatile    ·Command   ·Durable       │
  │                           ·Query     ·Replayable    │
  └─────────────────────────────────────────────────────┘
 ┌───────────────────────────────────────────────────────────────────────────────────────────────┐
 │  Working   │ Description                │ PROS                      │ CONS                    │
 │  Model     │                            │                           │                         │
 │───────────────────────────────────────────────────────────────────────────────────────────────│
 │  queuing   │ pool of consumers may read │ allows to split processing│ queues aren't           │
 │            │ from server and each record│ over multiple consumers,  │ multi─subscriber—once   │
 │            │ goes to one of them        │ scaling with easy         │ one process reads the   │
 │            │                            │                           │ data it's gone.         │
 │───────────────────────────────────────────────────────────────────────────────────────────────│
 │  publish/  │ record broadcast           │ broadcast data to         │ no way of scaling       │
 │  subscribe │ to all consumers.          │ multiple processes        │ since every message     │
 │            │                            │                           │ goes to every subscriber│
 └───────────────────────────────────────────────────────────────────────────────────────────────┘


ºMessage Queuesº
Defined by
- message oriented architecture
- Persistence (or durability until comsuption)
- queuing
- Routing: point-to-point / publish-and-subscribe
- No processing/transformation of message/data

ºImplementations and standardsº
- AMQP
@[https://en.wikipedia.org/wiki/Advanced_Message_Queuing_Protocol]
  - Open standard network protocol
  - Often compared to JMS:
    - JMS defines API interfaces, AMQP defines network protocol
    - JMS has no requirement for how messages are formed and
      transmitted and thus every JMS broker can implement the
      messages in a different (incompatible) format.
      AMQP publishes its specifications in a downloadable XML format,
      allowing library maintainers to generate APIs driven by
      the specs while also automating construction of algorithms to
      marshal and demarshal messages.
  - brokers implementations supporting it:
    RabbitMQ, ActiveMQ, Qpid, Solace, ...
    @[https://www.amqp.org/about/examples]
    - Apache Qpid (TODO)
    @[http://qpid.apache.org/]
    - INETCO's AMQP protocol analyzer
    @[http://www.inetco.com/resource-library/technology-amqp/]
    - JORAM  JMS + AMPQ
    @[http://joram.ow2.org/] 100% pure Java implementation of JMS
    - Kaazing's AMQPºWeb Clientº
    @[http://kaazing.net/index.html]
    - Azure Service Bus+ AMPQ
- What's wrong with AMQP?
@[https://news.ycombinator.com/item?id=1657574]
- @[http://www.windowsazure.com/en-us/develop/net/how-to-guides/service-bus-amqp-overview/]
- JBoss A-MQ, built from @[http://qpid.apache.org/]
@[http://www.redhat.com/en/technologies/jboss-middleware/amq]
- IBM MQLight
@[https://developer.ibm.com/messaging/mq-light/]
- StormMQ
@[http://stormmq.com/]
  a cloud hosted messaging service based on AMQP
- RabbitMQ (by VMware Inc)
@[http://www.rabbitmq.com/];
  also supported by SpringSource
...
- Java JMS
Message Brokers
- Routing
- (De-)Multiplexing of messages from/into multiple messages to different recipients
- Durability
- Transformation (translation of message between formats)
- "things usually get blurry - many solutions are both (message queue and message
  broker) - for example RabbitMQ or QDB.  Samples for message queues are
  Gearman, IronMQ, JMS, SQS or MSMQ."
  Message broker examples are, Qpid, Open AMQ or ActiveMQ.
- Kafka can also be used as message broker but is not its main intention
RabbitMQ
@[https://www.rabbitmq.com/getstarted.html]
- Multiprotocol (AMQP, ...)
  P: Producer, C: consumer

  P →  │││queue_name │││ ──→ C                                Work queues
                         └─→ C                               ------------------

            ┌→ │││queue1 │││ ──→ C
  P →  │X│ ─┤                                                 Publish/Subscribe
            └→ │││queue2 │││ ──→ C                            ------------------



            ┌→ (error) │││queue1 │││ ──→ C
  P →  │X│ ─┤                                                 Routing
        ^   │       info                                      -------
        |   └───────error──→ │││queue2 │││ ──→ C              
                    warning 
     type=direct


            ┌→ *.orange.* → │││queue1 │││ ──→ C
  P →  │X│ ─┤                                                 Topics
        ^   │                                                 ------
        |   └─ *.*.rabbit → │││queue2 │││ ──→ C 
               lazy.*
     type=topic



  C ─→ Request:            ──→ │││ request  │││ ──→  Server   Async RPC
  ^    reply_to=amqp.gen...                           │       ---------
  │    correlation_id=abc                             │
  │                                                   │
  └─── Reply               ─── │││ response │││ ──────┘ 
         correlation_id=abc

OpenHFT: microSec Messaging storing to disk
https://github.com/OpenHFT/
   - Chronicle-Queue: Micro second messaging that stores everything to disk
   - Chronicle-Accelerate: HFT meets Blockchain in Java platform
         XCL is a new cryptocurrency project that, learning from the previous
         Blockchain implementations, aims to solve the issues limiting adoption
         by building an entirely new protocol that can scale to millions of
         transactions per second, delivering consistent sub-second latency. Our
         platform will leverage AI to control volatility and liquidity, require
         low energy and simplify compliance with integrated KYC and AML support.

         The XCL platform combines low latencies (sub millisecond), IoT transaction
         rates (millions/s), open source AI volatility controls and blockchain for
         transfer of value and exchange of value for virtual fiat and crypto
         currencies. This system could be extended to other asset classes such as
         securities and fixed income. It uses a federated services model and
         regionalized payment systems making it more scalable than a blockchain
         which requires global consensus.

         The platform makes use of Chronicle-Salt for encryption and Chronicle-Bytes.

         on Chronicle Core’s direct memory and OS system call access.



   - Chronicle-Logger: A sub microsecond java logger, supporting standard logging
                      APIs such as Slf & Log4J

Distributed cache
(Extracted from whitepaper by Christoph Engelbert, Soft.Arch.at Hazelcast)
cache hit: data is already available in the cache when requested
           (otherwise it's said cache miss)

- Caches are implemented as simple key-value stores for performance.
- Caching-First: term to describe the situation where you start thinking
       about Caching itself as one of the main domains of your application.

ºUssageº
-ºReference Dataº
  - normally small and used to speed up the dereferencing
    of previously known, fixed number of elements (e.g.
    states of the USA, abbreviations of elements,...).
-ºActive DataSetº
- Grow to their maximum size and evict the oldest or not
  frequently used entries to keep in memory bounds.

Caching Strategies
ºCooperative (Distributed) cachingº
  different  cluster-nodes work together to build a huge, shared cache
  Ussually an "intelligent" partitioning algorithm is used to balance
  load about cluster nodes.
  - common approach when system requires large amounts of data to be cached

ºPartial Cachingº
- not all data is stored in the cache.

ºGeographical Cachingº
- located in chosen locations to optimize latency
- CDN (Content Delivery Network) is the best known example of this type of cache
- Works well when  content changes less often.

ºPreemptive Cachingº
- mostly used in conjunction with a Geographical Cache
- Using a warm-up engine a Preemptive Cache is populated on startup
  and tries to update itself based on rules or events.
- The idea behind this cache addition is to reload data from any
  backend service or central cluster even before a requestor wants
  to retrieve the element. This keeps access time to the cached
  elements constant and prevents accesses to single elements from
  becoming unexpectedly long.
- Can be difficult to implement properly and requires a lot of
  knowledge of the cached domain and the update workflows

ºLatency SLA Cachingº
- It's able to maintain latency SLAs even if the cache is slow
  or overloaded. This type of cache can be build in two different ways.
  - Having a timeout to exceed before the system either requests
    the potentially cached element from the original source
    (in parallel to the already running cache request) or simple
    default answer, using whatever returns first.
  - Always fire both requests in parallel and take whatever returns first.
    (discouraged since it mostly dimiss the value of caching). Can make
    sense if multiple caching layers are available.

Caching Topologies
-ºIn-process:º
 - cache share application's memory space.
 - most oftenly used in non-distributed systems.
 - fastest possible access speed.
 - Easy to build, but complex to grow.

-ºEmbedded Node Cachesº
- the application itself will be part of the cluster.
- kind of combination between an In-Process Cache and the
  Cooperative Caching
- it can either use partitioning or full dataset replication.

- CONST: Application and cache cannot be scaled independently

-ºClient-Server Cachesº
- these systems tend to be Cooperative Caches by having a
  multi-server architecture to scale out and have the
  same feature set as the Embedded Node Caches
  but with the client layer on top.
- This architecture keeps separate clusters of the applications
  using the cached data and the data itself, offering
  the possibility to scale the application cluster and the
  caching cluster independently.

Evict Strategies
-ºLeast Frequently usedº
 - values that are accessed the least amount of times are
   remove on memory preasure.
 - each cache record must keep track of its accesses using
   a counter which is increment only.

-ºLeast Recently Usedº
 - values that were last used most far back in terms of time
   are removed on memory preasure.
 - each record keeps must track of its last access timestamp
Other evict strategies can be found at
@[https://en.wikipedia.org/wiki/Cache_replacement_policies]

Memcached
@[https://www.memcached.org/
- distributed memory object caching system
- Memcached servers are unaware of each other. There is no crosstalk, no
  syncronization, no broadcasting, no replication. Adding servers increases
  the available memory. Cache invalidation is simplified, as clients delete
  or overwrite data on the server which owns it directly
- initially intended to speed up dynamic web applications alleviating database load

ºMemcached-session-managerº
@[https://github.com/magro/memcached-session-manager]
  tomcat HA/scalable/fault-tolerant session manager
- supports sticky and non-sticky configurations
- Failover is supported via migration of sessions
@[https://www.infoworld.com/article/3063161/why-redis-beats-memcached-for-caching.html]
Redis
- in-memory data structure store, used as a key-value database and cache
- Since it can also notify listener of changes in its state it can
  also be used as message broker (this is the case for example
  in Kubernetes, where etcd implement an asynchronous message system
  amongst its componentes).
- supports data structures such as strings, hashes, lists, sets, sorted sets
  with range queries, bitmaps, hyperloglogs and geospatial indexes with
  radius queries
- Redis has built-in replication, Lua scripting, LRU eviction, transactions
  and different levels of on-disk persistence, and provides high availability
  via Redis Sentinel and automatic partitioning with Redis Cluster
@[https://www.infoworld.com/article/3063161/why-redis-beats-memcached-for-caching.html]
@[https://www.infoq.com/news/2018/10/Redis-5-Released]

- Eval: Lua Scripts [TODO]
@[https://redis.io/commands/eval] 
Hazelcast
in-memory
data grid
@[https://en.wikipedia.org/wiki/Hazelcast]
-  based on Java
Ehcache
(terabyte)
cache
@[http://www.ehcache.org/]
- Can be used as tcp service (distributed cache) or process-embedded
  TODO:  Same API for local and distributed objects?
- open source, standards-based cache that boosts performance, offloads I/O
- Integrates with other popular libraries and frameworks
- It scales from in-process caching, all the way to mixed
  in-process/out-of-process deployments withºterabyte-sized cachesº

ºExample Ehcache 3 APIº:
CacheManager cacheManager =
  CacheManagerBuilder.newCacheManagerBuilder()
  .withCache("preConfigured",
    CacheConfigurationBuilder.newCacheConfigurationBuilder(Long.class, String.class,
      ResourcePoolsBuilder.heap(100))
  .build())
  .build(true);

Cache˂Long, String˃ preConfigured =
    = cacheManager.getCache("preConfigured", Long.class, String.class);

Cache˂Long, String˃ myCache = cacheManager.createCache("myCache",
    CacheConfigurationBuilder.newCacheConfigurationBuilder(Long.class, String.class,
                                  ResourcePoolsBuilder.heap(100)).build());

myCache.put(1L, "da one!");
String value = myCache.get(1L);

cacheManager.close();

(simpler/lighter solution but not so escalable could be to use Google Guava Cache)
JBoss Cache
@[http://jbosscache.jboss.org/]
Distributed Storage
SAN vs ...
@[https://serversuit.com/community/technical-tips/view/storage-area-network,-and-other-storage-methods.html]
Storage Area Network, and Other Storage Methods
Data Lake
@[https://en.wikipedia.org/wiki/Data_lake]
UStore!!!
@[https://arxiv.org/pdf/1702.02799.pdf]
Distributed Storage with rich semantics!!!

Today’s storage systems expose abstractions which are either  too  low-level
(e.g.,  key-value  store,  raw-blockstore)   that   they   require   developers
  to   re-invent   thewheels, or too high-level (e.g., relational databases,
Git)that they lack generality to support many classes of ap-plications.   In
this  work,  we  propose  and  implement  ageneral  distributed  data  storage
system,  called  UStore,which  has  rich  semantics.    UStore  delivers  three
 keyproperties,  namely  immutability,  sharing  and  security,which unify and
add values to many classes of today’sapplications, and which also open the
door for new ap-plications.   By  keeping  the  core  properties  within
thestorage, UStore helps reduce application development ef-forts while offering
high performance at hand. The stor-age embraces current hardware trends as key
enablers.It is built around a data-structure similar to that of Git,a  popular
source  code  versioning  system,  but  it  alsosynthesizes many designs from
distributed systems anddatabases.   Our  current  implementation  of  UStore
hasbetter  performance  than  general  in-memory  key-valuestorage systems,
especially for version scan operations.We  port  and  evaluate  four
applications  on  top  of  US-tore:  a Git-like application, a collaborative
data scienceapplication,  a transaction management application,  anda
blockchain application.   We demonstrate that UStoreenables  faster
development  and  the  UStore-backed  ap-plications can have better performance
than the existingimplementations

""" Our current implementation of UStore hasbetter performance than 
general in-memory key-valuestorage systems, especially for version 
scan operations.We port and evaluate four applications on top of 
US-tore: a Git-like application, a collaborative data 
scienceapplication, a transaction management application, anda 
blockchain application. We demonstrate that UStore enables faster 
development and the UStore-backed applications can have better 
performance than the existing implementations. """
NFS considered harmful
@[http://www.time-travellers.org/shane/papers/NFS_considered_harmful.html]
cluster-FS comparative
@[http://zgp.org/linux-tists/20040101205016.E5998@shaitan.lightconsulting.com.html]
Ceph (exabyte Soft.Defined Storage)
@[http://ceph.com/ceph-storage/]
Ceph’s RADOS provides you with extraordinary data storage  scalability—
thousands of client hosts or KVMs accessing petabytes to
exabytes of data. Each one of your applications can use the object, block or
file system interfaces to the same RADOS cluster simultaneously, which means
your Ceph storage system serves as a  flexible foundation for all of your
data storage needs. You can use Ceph for free, and deploy it on economical
commodity hardware. Ceph is a  better way to store data.

By decoupling the namespace from the underlying hardware, object-based
storage systems enable you to build much larger storage clusters. You
can scale out object-based storage systems using economical commodity hardware
, and you can replace hardware easily when it malfunctions or fails.

Ceph’s CRUSH algorithm liberates storage clusters from the scalability and
performance limitations imposed by centralized data table mapping. It
replicates and re-balance data within the cluster dynamically—elminating this
tedious task for administrators, while delivering high-performance and
infinite scalability.

See more at: http://ceph.com/ceph-storage/#sthash.KNp2tGf5.dpuf

When Ceph is Not Enough
Josh Goldenhar, VP Product Marketing, Lightbits Labs
...  In this 30-minute webinar, we'll discuss the origins of Ceph and why 
  it's a great solution for highly scalable, capacity optimized storage 
  pools. You’ll learn how and where Ceph shines but also where its 
  architectural shortcomings make Ceph a sub-optimal choice for today's 
  high performance, scale-out databases and other key web-scale 
  software infrastructure solutions.
  Participants will learn:
  • The evolution of Ceph
  • Ceph applicability to infrastructures such as OpenStack, 
    OpenShift and other Kubernetes orchestration environments
Rº• Why Ceph can't meet the block storage challenges of modern, º
Rº  scale-out, distributed databases, analytics and AI/ML workloadsº
Rº• Where Cephs falls short on consistent latency responseº
  • Overcoming Ceph’s performance issues during rebuilds
Bº• How you can deploy high performance, low latency block storage inº
Bº  the same environments Ceph integrates with, alongside your Ceph º
Bº  deployment.º
GlusterFS
- Software defined distributed storage,
Tachyon memory-centric distributed FS
@[http://tachyon-project.org/index.html]
- memory-centric distributed file system enabling reliable file sharing at memory-speed
across cluster frameworks, such as Spark and MapReduce. It achieves high performance by leveraging
lineage information and using memory aggressively. Tachyon caches working set files in memory,
thereby avoiding going to disk to load datasets that are frequently read. This enables different
jobs/queries and frameworks to access cached files at memory speed.

Tachyon is Hadoop compatible. Existing Spark and MapReduce programs can run on top of it without
any code change. The project is open source (Apache License 2.0) and is deployed at multiple companies.
It has more than 40 contributors from over 15 institutions, including Yahoo, Intel, and Redhat.

The project is the storage layer of the Berkeley Data Analytics Stack (BDAS) and also part of the
Fedora distribution.
S3QL FUSE FS (Amazon S2, GCS, OpenStack,...)
- FUSE-based file system
- backed by several cloud storages:
  - such as Amazon S2, Google Cloud Storage, Rackspace CloudFiles, or OpenStack
@[http://xmodulo.com/2014/09/create-cloud-based-encrypted-file-system-linux.html]
- S3QL is one of the most popular open-source cloud-based file systems.
- full featured file system:
  - unlimited capacity
  - up to 2TB file sizes
  - compression
  - UNIX attributes
  - encryption
  - snapshots with copy-on-write
  - immutable trees
  - de-duplication
  - hardlink/symlink support, etc.
- Any bytes written to an S3QL file system are compressed/encrypted
  locally before being transmitted to cloud backend.
- When you attempt to read contents stored in an S3QL file system, the
  corresponding objects are downloaded from cloud (if not in the local
  cache), and decrypted/uncompressed on the fly.
Nexenta
@[https://nexenta.com/]
- Software-Defined Storage Product Family.
Minio.io
@[https://minio.io]
- Private Cloud Storage
- high performance distributed object storage server, designed for
  large-scale private cloud infrastructure. Minio is widely deployed across the
  world with over 146.6M+ docker pulls.
Why Object storage?
@[https://www.ibm.com/developerworks/library/l-nilfs-exofs/index.html]

Object storage is an interesting idea and makes for a much more scalable
system. It removes portions of the file system from the host and pushes them
into the storage subsystem. There are trade-offs here, but by distributing
portions of the file system to multiple endpoints, you distribute the workload,
making the object-based method simpler to scale to much larger storage systems.
Rather than the host operating system needing to worry about block-to-file
mapping, the storage device itself provides this mapping, allowing the host to
operate at the file level.

Object storage systems also provide the ability to query the available
metadata. This provides some additional advantages, because the search
capability can be distributed to the endpoint object systems.

Object storage has made a comeback recently in the area of cloud storage. Cloud
storage providers (which sell storage as a service) represent their storage as
objects instead of the traditional block API. These providers implement APIs
for object transfer, management, and metadata management.
Distributed Tracing
Zipkin
@[https://zipkin.io/]
Genesis Distributed Testing
@[https://docs.whiteblock.io/introduction_to_testing.html]
The following are types of tests one can perform on a distributed system:
- Functional Testing is conducted to test whether a system performs as it was 
  specified or in accordance with formal requirements
- Performance Testing tests the reliability and responsiveness of a system 
  under different types of conditions and scenarios
- Penetration Testing tests the system for security vulnerabilities
- End-to-End Testing is used to determine whether a system’s process flow 
  functions as expected
- Fuzzing is used to test how a system responds to unexpected, random, or 
  invalid data or inputs

Genesis is a versatile testing platform designed to automate the tests listed 
above, making it faster and simpler to conduct them on distributed systems 
where it was traditionally difficult to do so. Where Performance, End-to-End, 
and Functional testing comprise the meat of Genesis’ services, other types of 
testing are enabled through the deployment of services and sidecars on the 
platform.

End-to-End tests can be designed by applying exit code checks for process 
completion, success, or failure in tasks and phases, while Performance tests 
can be conducted by analyzing data from tests on Genesis that apply a variety 
of network conditions and combinations thereof. Functional tests can use a 
combination of tasks, phases, supplemental services and sidecars, and network 
conditions, among other tools.

These processes and tools are further described in this documentation.

AAA
keycloak
@[https://www.keycloak.org/]
@[https://medium.com/keycloak/keycloak-essentials-86254b2f1872]                   [TODO]
@[https://medium.com/keycloak/keycloak-realm-client-configuration-dfd7c8583489]

• See also @[../WebTechnologies/map.html#keycloak_angular_openid]

"""Add authentication to applications and secure services with minimum fuss"""
-  No need to deal with storing users or authenticating users.
- It's all available out of the box.
- Advanced features such as User Federation, Identity Brokering and Social Login.

   Single-Sign On                        LDAP and Active Directory
   Login once to multiple applications   Connect to existing user directories
   Standard Protocols                    Social Login
   OpenID Connect, OAuth 2.0             Easily enable social login
   and SAML 2.0

   Identity Brokering                    Clustering
   OpenID Connect or SAML 2.0 IdPs       For scalability and availability
   High Performance                      Themes
   Lightweight, fast and scalable        Customize look and feel

   Extensible
   Customize through code
   Password Policies
   Customize password policies


- AA using the Keycloak REST API
@[https://developers.redhat.com/blog/2020/11/24/authentication-and-authorization-using-the-keycloak-rest-api/]
@[https://github.com/edwin/java-keycloak-integration]
Ex Use Case:
- centralized ID platform for Indonesia’s Ministry of Education
  single sign-on:
  Identity Roles:
  - Students (IDs)
  - Teachers (IDs)
 - each user have different access and privileges at each school.

BºSETUP:º {
  → Creating Keycloak realm:
    → Add 'education' realm for ministry.
      (click enable and create)
      → go to 'Roles' page:
        → choose tab "Realm Roles".
          → Create two roles:
            - "teacher"
            - "student"
            → (Clients page) click Create to add a client:
              - create "jakarta-school" client and save
              → Go to jakarta-school details page → "Settings tab":
                - Enter next client config:
                  Client ID: jakarta-school
                  Enabled: ON
                  Consent Required: OFF
                BºClient Protocol: openid-connectº
                  Access Type: confidential
                BºStandard Flow Enabled: ONº
                  Impact Flow Enabled: OFF
                BºDirect Access Grants Enabled: ONº
                  Browser Flow: browser            (← Scroll to bottom)
                  Direct Grant Flow: direct grant
                → Go to tab "Roles" tab → Add next Roles:
                  - "create-student-grade"
                  - "view-student-grade"
                  - "view-student-profile"
                  → go to tab "Client Scopes" → Default Client Scopes and   
                    add “roles” and “profile” to Assigned Default Client
                    Scopes.
                    → go to "jakarta-school details" page → Mappers →
                      Create Protocol Mappers: 
                      - Set mappers to display the client roles on the
                        Userinfo API:
                        Name               : roles
                        Mapper Type        : User Realm Role
                        Multivalued        : ON
                        Token Claim Name   : roles
                        Claim JSON Type    : String
                        Add to ID token    : OFF
                        Add to access token: OFF
                        Add to userinfo    : ON
                      → go to "Users page" → "Add user".
                        - create the new users and Save. Ex:
                          Username      : edwin
                        BºEmail         : edwin@redhat.comº
                          First Name    : Edwin
                          Last Name     : M
                          User Enabled  : ON
                        BºEmail Verified: OFFº
                        → Go to tab "Role Mappings" → Client Roles 
                          (for each user in jakarta-school)
                          Check that roles perm. are enabled:
                          "create-student-grade"
                          "view-student-grade"
                          "view-student-profile".
}

BºSpring Boot Java App using Keycloak for AAº
  - pom.xml:
    ˂?xml version="1.0" encoding="UTF-8"?˃
    ...  
        ˂dependencies˃
        ...  
            ˂dependency˃
                ˂groupId˃com.auth0˂/groupId˃
                ˂artifactId˃jwks-rsa˂/artifactId˃
                ˂version˃0.12.0˂/version˃
            ˂/dependency˃
    
            ˂dependency˃
                ˂groupId˃com.auth0˂/groupId˃
                ˂artifactId˃java-jwt˂/artifactId˃
                ˂version˃3.8.3˂/version˃
            ˂/dependency˃
        ˂/dependencies˃
    ˂/project˃

  - Config. properties file:
    server.error.whitelabel.enabled=false
    spring.mvc.favicon.enabled=false
    server.port = 8082
    
    keycloak.client-id=jakarta-school
    keycloak.client-secret=197bc3b4-64b0-452f-9bdb-fcaea0988e90
    keycloak.scope=openid, profile
    keycloak.authorization-grant-type=password
    
    keycloak.authorization-uri=http://localhost:8080/auth/realms/education/protocol/openid-connect/auth
    keycloak.user-info-uri=http://localhost:8080/auth/realms/education/protocol/openid-connect/userinfo
    keycloak.token-uri=http://localhost:8080/auth/realms/education/protocol/openid-connect/token
    keycloak.logout=http://localhost:8080/auth/realms/education/protocol/openid-connect/logout
    keycloak.jwk-set-uri=http://localhost:8080/auth/realms/education/protocol/openid-connect/certs
    keycloak.certs-id=vdaec4Br3ZnRFtZN-pimK9v1eGd3gL2MHu8rQ6M5SiE
    
    logging.level.root=INFO
    
  - Config: Get the Keycloak public certificate ID from:
    Keycloak → Education → Keys → Active (with RSA-generated key selected).

  - Integration code: (several Keycloak APIs are involved).

    - logout API:
      @Value("${keycloak.logout}")
      private String keycloakLogout;

      public void logout(String refreshToken) throws Exception {
        MultiValueMap˂String, String˃ map = new LinkedMultiValueMap˂˃();
        map.add("client_id",clientId);
        map.add("client_secret",clientSecret);
        map.add("refresh_token",refreshToken); // ← RºWARNº: Use refresh 
                                               //   (vs access) token

        HttpEntity˂MultiValueMap˂String, String˃˃ request = new HttpEntity˂˃(map, null);
        restTemplate.postForObject(keycloakLogout, request, String.class);
      }

    - Check whether a bearer token is valid and active or not
      @Value("${keycloak.user-info-uri}")
      private String keycloakUserInfo;
      private String getUserInfo(String token) {
        MultiValueMap˂String, String˃ headers = new LinkedMultiValueMap˂˃();
        headers.add("Authorization", token);
        HttpEntity˂MultiValueMap˂String, String˃˃ request =
          new HttpEntity˂˃(null, headers);
        return restTemplate.postForObject(keycloakUserInfo, request, String.class);
      }

    - Authorization (== "role valid for a given API"?) alternatives:
      - Alt 1. determine what role a bearer token brings by verifying it 
               against Keycloak’s userinfo API
             RºDrawback:º multiple roundtrip request needed
        following response from Keycloak is expected:
        {
            "sub": "ef2cbe43-9748-40e5-aed9-fe981e3082d5",
          Oº"roles": [ "teacher" ],º
            "name": "Edwin M",
            "preferred_username": "edwin",
            "given_name": "Edwin",
            "family_name": "M"
        }

      - Alt 2: decode JWT bearer token and validate it.
        - code needs firt the public key used for signing the token:
          (It requires a round trip but can be cached)
          - Keycloak provides a JWKS endpoint. Ex:
            $ curl -L -X GET 'http://.../auth/realms/education/protocol/openid-connect/certs'
            { "keys": [
                {
                  "kid": "vdaec4Br3ZnRFtZN-pimK9v1eGd3gL2MHu8rQ6M5SiE", ← key id
                  "kty": "RSA",
                  "alg": "RS256",
                  "use": "sig",
                  "n": "4OPCc_LDhU6ADQj7cEgRei4....",← Oºpublic key usedº
                  "e": "AQAB"                          Oºfor decodingº
                }
              ] }
        - A sample decoded JWT token will look like:
          {
            "jti": "85edca8c-a4a6-4a4c-b8c0-356043e7ba7d",
            "exp": 1598079154,
            "nbf": 0,
            "iat": 1598078854,
            "iss": "http://localhost:8080/auth/realms/education",
            "sub": "ef2cbe43-9748-40e5-aed9-fe981e3082d5",
            "typ": "Bearer",
            "azp": "jakarta-school",
            "auth_time": 0,
            "session_state": "f8ab78f8-15ee-403d-8db7-7052a8647c65",
            "acr": "1",
            "realm_access": { "roles": [ "teacher" ] },
            "resource_access": {
              "jakarta-school": {
                "roles": [ "create-student-grade", "view-student-profile",
                  "view-student-grade" ]
              }
            },
            "scope": "profile",
            "name": "Edwin M",
            "preferred_username": "edwin",
            "given_name": "Edwin",
            "family_name": "M"
          }
          Ex JWT validation code:
          @GetMapping("/teacher")
          HashMap teacher(
              @RequestHeader("Authorization") String authHeader) {
                  try {
              DecodedJWT jwt = JWT.decode(
                                 authHeader.replace("Bearer", "").trim());
              Jwk jwk = jwtService.getJwk();
              Algorithm algorithm = Algorithm.RSA256(
                        (RSAPublicKey) jwk.getPublicKey(), null);
              algorithm.verify(jwt);                // ← Bºcheck signatureº

              List˂String˃ roles = ((List)jwt.getClaim("realm_access").
                                    asMap().get("roles"));
              if(!roles.contains("teacher"))        //  ← Bºcheck rolesº
                  throw new Exception("not a teacher role");

              Date expiryDate = jwt.getExpiresAt(); // ← Bºcheck expirationº
              if(expiryDate.before(new Date()))
                  throw new Exception("token is expired");

              return new HashMap() {{ put("role", "teacher"); }}; // validation OK
                  } catch (Exception e) {
              throw new RuntimeException(e);
                  }
          }


FreeIPA
@[https://www.reddit.com/r/linuxadmin/comments/apbjtc/freeipa_groups_and_linux_usernames/]
Non-classified
API Management
@[https://www.datamation.com/applications/top-api-management-tools.html]

Behaviour
Driven
Development
(BDD)
- Cucumber, ...
@[https://cucumber.io/docs/guides/10-minute-tutorial/]
Infra vs App Monitoring
BºInfrastructure Monitoring:º
  - Prometheus + Grafana
    (Alternatives include Monit, Datadog, Nagios, Zabbix, ...)

BºApplication Monitoring (End-to-End Request Tracing)º
  - Jaeger (OpenTracing compatible), Zipkin, New Relic
    (Alternatives include AppDynamics, Instana, ...) 
  -ºtrack operations inside and across different systemsº.
  - Check for example how an incomming request affected to the web
    server, database, app code, queue system, all presented along 
    a timeline.
  - Especially valuable in distributed systems.
  - It complements logs and metrics:
    - App.request metrics warns about latencies in remote requests
      while local traces just do at local isolated systems. 
    - Logs on each isolated system can explain why such system is "slow".
    - Infra. monitoring warn about scarcity of CPU/memory/storage resources.

BºLogging How Toº:
  → Start by adding logs and infra monitoring
    → add application monitoring:
      ( requires support from programming languages/libraires, developpers and devOps teams)
      - instrument code (Jaeger)
      - add tracing to infrastructure components 
        - load balancers
      → deploy App tracing system itself. 
        (Ex.: Jaeger server, ...)
 
Bº(Bulk) Log Managementº
  - Elastic Stack
    (Alternative include Graylog, Splunk, Papertrail, ...)

Jaeger (App.Monit)
@[https://www.jaegertracing.io/]
@[https://logz.io/blog/zipkin-vs-jaeger/]
- Support 
- Support for Open Tracing instrumentation libraries
  like @[https://github.com/opentracing-contrib].
- K8s Templates and Operators supported. [k8s]
  
- Jaeger addresses next problems:
  - distributed transaction monitoring.
  - performance and latency optimization.
  - root cause analysis.
  - service dependency analysis.
  - distributed context propagation.

  Jaeger Log Data Architecture Schema:
  (Based on OpenTracing)
@[https://www.jaegertracing.io/docs/1.21/architecture/]
@[https://github.com/opentracing/specification/blob/master/specification.md]

  ºSPANº                    ºTRACEº:
   (logical work-unit)       - data/execution path 
   ----------------            through system components
   - operation name            (sort of directed acyclic
   - start time                 graph of spans).
   - duration.
   - Spans may be nested
     and ordered to model 
     causal relationships.

     Unique ID → │A│
     (Context)    └─│B│
                  └─│C│─│D│
                  └········│E│
                   ^^^^^^^^
                   barrier waiting for 
                   │B│,│C│ results
     

     ········→ time line →··············· 
                                          ┐
     ├─────────────  SPAN A ────────────┤ │TRACE
       ├──── SPAN B ───┤                  │
       ├─ SPAN C ─┤                       │
                  ├─ SPAN D ─┤            │
                             ├─ SPAN E ─┤ │
                                          ┘
   
   
   
   

Components
  Jaeger Components Schema:

BºDATA INPUTº
  Application                             jaeger-collector
  └→ App Instrumetation                ┌→ (Go Lang)
     └→ OpenTracing API                ·  + memory-queue ·┐
        └→ Jaeger-client ···→ Jaeger ··┘                  · 
                              agent                       ·
                            (Go lang)                     ·
                                                          ·
                                           Data Store  ←··┘
                                          (Cassandra,
                                           Elastic Search,
                                           Kafka, Memory)
                                              ^  ^
                                              ·  ·
BºDATA QUERY (App. Monit)º                    ·  ·
                                              ·  ·
  Jaeger-UI    ←····→  jaeger-query ←·········┘  ·
  (Web React)          (GO lang)                 ·
   (Alt.1)                                       ·
                                                 ·
     Spark ······································┘
   (Alt.2) 

Disruptor
  Disruptor
  See also
  lmax-exchange.github.com

  When Disruptor is not good fit
GraphQL
GraphQL: Facebook-developed language that provides a powerful API to get only
the dataset you need in a single request, seamlessly combining data sources.

REF: https://www.redhat.com/en/topics/api/what-is-graphql
The purpose of this monorepo is to give the GraphQL Community:
    a to-specification official language service (see: API Docs)
    a comprehensive LSP server and CLI service for use with IDEs
    a codemirror mode
    a monaco mode (in the works)
    an example of how to use this ecosystem with GraphiQL.
    examples of how to implement or extend GraphiQL.


ApolloGraphQL
Simplify app development by combining APIs, databases, and microservices into a
single data graph that you can query with GraphQL
@[https://www.apollographql.com/]
Big Data
Ambari: Hadoop Cluster Provision
@[https://projects.apache.org/project.html?ambari]
Apache Ambari makes Hadoop cluster provisioning, managing, and monitoring dead simple.
Spark
@[http://spark.apache.org/]
- General cluster computing framework 
- initially designed around the concept of Resilient Distributed Datasets (RDDs).
  - RDDs enable data reuse by persisting intermediate results in memory 
    - Use case 1: fast computations for iterative algorithms.
      - especially beneficial for work flows like machine learning 
        (Bºsame operation may be applied over and over againº)
    - Use case 2: clean and/or transform data.
- Speed
  - Run workloads 100x faster (compared to Hadoop?).
  - Apache Spark achieves high performance for both batch
    and streaming data, using a state-of-the-art DAG scheduler,
    a query optimizer, and a physical execution engine.

- Ease of Use
  - Write applications quickly in Java, Scala, Python, R, and SQL.
  - Spark offers over 80 high-level operators that make it easy to
    build parallel apps. And you can use it interactively from the
    Scala, Python, R, and SQL shells.
    - Example PythonBºDataFrame APIº.
      |Bºdfº=ºspark.read.jsonº("logs.json") ← automatic schema inference
      |Bºdfº.where("age ˃ 21")
      |    .select("name.first").show()


- Generality
  - Combine SQL, streaming, and complex analytics.
  - Spark powers a stack of libraries including SQL and DataFrames,
    MLlib for machine learning, GraphX, and Spark Streaming.
    You can combine these libraries seamlessly in the same application.

- Runs Everywhere
  - Spark runs on Hadoop, Apache Mesos, Kubernetes, standalone,
    or in the cloud. It can access diverse data sources.

- You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN,
  on Mesos, or on Kubernetes. Access data in HDFS, Alluxio, Apache Cassandra,
  Apache HBase, Apache Hive, and hundreds of other data sources.

- Common applications for Spark include real-time marketing campaigns, online
  product recommendations, cybersecurity analytics and machine log monitoring.


Apache Druid - Complementary to Spark, Druid can be used to accelerate OLAP queries in Spark. - Spark: indicated for BºNON-interactiveº latencies. - Druid: indicated for Bº interactiveº (sub-milisec) scenarios. - Use-case: - powering applications used by thousands of users where each query must return fast enough such that users can interactively explore through data. - Druid fully indexes all data, and Bºcan act as a middle layer between º BºSpark and final appº. BºTypical production setupº: data → Spark processing → Druid → Serve fast queries to clients More info at: https://www.linkedin.com/pulse/combining-druid-spark-interactive-flexible-analytics-scale-butani
Hadoop "vs" Spark
@[https://www.infoworld.com/article/3014440/big-data/five-things-you-need-to-know-about-hadoop-v-apache-spark.html]
Hadoop is essentially a distributed data infrastructure:
 -It distributes massive data collections across multiple nodes
  within a cluster of commodity servers
 -It also indexes and keeps track of that data, enabling
  big-data processing and analytics far more effectively
  than was possible previously.
Spark, on the other hand, is a data-processing tool that operates on those
distributed data collections; it doesn't do distributed storage.

You can use one without the other:
  - Hadoop includes not just a storage component, known as the
  Hadoop Distributed File System, but also a processing component called
  MapReduce, so you don't need Spark to get your processing done.
  - Conversely, you can also use Spark without Hadoop. Spark does not come with
  its own file management system, though, so it needs to be integrated with one
  - if not HDFS, then another cloud-based data platform. Spark was designed for
  Hadoop, however, so many agree they're better together.

Spark is generally a lot faster than MapReduce because of the way it processes
data. While MapReduce operates in steps, Spark operates on the whole data set
in one fell swoop:
   "The MapReduce workflow looks like this: read data from the cluster, perform
    an operation, write results to the cluster, read updated data from the
    cluster, perform next operation, write next results to the cluster, etc.,"
    explained Kirk Borne, principal data scientist at Booz Allen Hamilton.
    Spark, on the other hand, completes the full data analytics operations
    in-memory and in near real-time:
    "Read data from the cluster, perform all of the requisite analytic
    operations, write results to the cluster, done," Borne said.
Spark can be as much as 10 times faster than MapReduce for batch processing and
p to 100 times faster for in-memory analytics, he said.
  You may not need Spark's speed. MapReduce's processing style can be just fine
if your data operations and reporting requirements are mostly static and you
can wait for batch-mode processing. But if you need to do analytics on
streaming data, like from sensors on a factory floor, or have applications that
require multiple operations, you probably want to go with Spark.
 Most machine-learning algorithms, for example, require multiple operations.

Recovery: different, but still good.
Hadoop is naturally resilient to system faults or failures since data
are written to disk after every operation, but Spark has similar built-in
resiliency by virtue of the fact that its data objects are stored in something
called resilient distributed datasets distributed across the data cluster.
"These data objects can be stored in memory or on disks, and RDD provides full
recovery from faults or failures," Borne pointed out.
Hive
Data W.H. on top of Hadoop
@[https://projects.apache.org/project.html?hive]
- querying and managing utilities for large datasets
  residing in distributed storage built on top of Hadoop
- easy data extract/transform/load (ETL)
 * a mechanism to impose structure on a variety of data formats*
- access to files stored in Apache HDFS and other data storage
  systems such as Apache HBase (TM)

HiveMQ Broker: MQTT+Kafka for IoT
@[https://www.infoq.com/news/2019/04/hivemq-extension-kafka-mqtt/]

Data Science
Orange
@[https://orange.biolab.si/screenshots/]
Features:
- Interactive Data Visualization
- Visual Programming
- Student's friendly.
- Add-ons
- Python Anaconda Friendly
  $ conda config --add channels conda-forge
  $ conda install orange3
  $ conda install -c defaults pyqt=5 qt
- Python pip  Friendly
  $ pip install orange3
Example Architectures
observability: 
      loging
+ monitoring
+    tracing
Ex:ºFile Integrity Monitoring at scale: (RSA Conf)º
@[https://www.rsaconference.com/writable/presentations/file_upload/csv-r14-fim-and-system-call-auditing-at-scale-in-a-large-container-deployment.pdf]
Auditing log to gain insights at scale:

                         ┌─→ Pagerduty
            ┌─→ Grafana ─┼─→ Email
   Elastic  │            └─→ Slack
   Search  ─┼─→ Kibana
            │
            └─→ Pre─processing ─→ TensorFlow

Alt1:
  User   │ go-audit-                          User space
  land   │ container                             app
  ───────├─────  Netlink ───── Syscall iface ───────────
  Kernel │        socket           ^
         │          ^              |
                    └─  Kauditd ───┘
Loggin@Coinbase
@[https://www.infoq.com/news/2019/02/metrics-logging-coinbase]
Adidas Reference Architecture
Source @[https://adidas.github.io]

ºuser interfaceº

Working with the latest JavaScript technologies as well as building a stable
ecosystem is our goal. We are not only using one of many available frameworks,
we test them, we analyze our necessities and we choose what has the best
balance between productivity and developer satisfaction.

adidas is moving most of the applications from the desktop to the browser, and
we are improving every day to achieve it in the less time possible:
standardization, good practices, continuous deployment, automated tasks are
part of our daily work.

Our stack has changed a lot since the 2015. Back then we used code which is
better not to speak about. Now, we power our web applications with frameworks
like React and Vue and modern tooling. Our de facto tool for shipping these
applications is webpack.

We love how JavaScript is changing everyday and we encourage the developers to
use its latest features. TypeScript is also a good choice for developing
consistent, highly maintainable applications.


ºapiº
Apiary is our collaborative platform for designing, documenting and governing
API's. It helps us to adopt API first approach with templates and best
practices. Platform is strongly integrated with our API guidelines and helps us
to achieve consistency of all our API's. Apiary also speeds up and simplifies
the development and testing phase by providing API mock services and automation
of API testing.

Mashery is an API platform that delivers key API management capabilities needed
for successful digital transformation initiatives. Its full range of
capabilities includes API creation, packaging, testing, and management of your
API's and community of users. API security is provided via an embedded or
optional on-premise API gateway.

Runscope is our continuous monitoring platform for all our API's we are
exposing. It monitors uptime, performance and validates the data. It is tightly
integrated with Slack and Opsgenie in order to alert and create incident if
things are going wrong way. Using Runscope we can detect any issues with our
API's super fast and act accordingly to achieve highest possible SLA.

ºfast dataº
Apache Kafka is the core of our Fast Data Platform.
The more we work with it, the more versatile we perceived it was, covering more
use cases apart of event sourcing strategies.

It is designed to implement the pub-sub pattern in a very efficient way,
enabling Data Streaming cases, where multiple can easily subscribe to the same
stream of information. The fan-out is the pub-sub pattern is implemented in a
really efficient way, allowing to achieve high thought in the production and
the consumption part.

It also enables other capabilities like Data Extraction and Data Modeling, as
long as Stateful Event Processing. Together with Storm, Kafka Streams and the
Elastic stack, provides the perfect toolset to integrate our digital
applications in a modern self-service way.

ºacidº
Jenkins has become the de facto standard for continuous integration/continuous
delivery projects.
ACID is a 100% docker powered by Kubernetes that allows you to create a Jenkins
as a Service platform. ACID provides an easy way to generate an isolated
Jenkins environment per team while keeping a shared basic configuration and a
central management dashboard.
Updating and maintaining teams instances have never been so easy. Team freedom
provided out of the box, gitops disaster recovery capabilities and elastic
slaves allocation are the key pillars.

ºtestingº
SerenityBDD is our default for automation in test solution. Built on top of
Selenium and Cucumber, it enables us to write cleaner and more maintainable
automated acceptance and regression tests in a faster way. With archetypes
available for both frontend (web) and backend (API), it is almost
straightforward to be implemented in most projects.

Supporting a wide type of protocols and application types, Neoload is our tool
of choice for performance testing. It provides an easy to use interface with
powerful recording capabilities, allowing manual implementation as well if the
user chooses. Since it supports dockerized load generators, it makes it easy to
create a shared infrastructure and integrate it in our continuous integration
chain.

ºmonitoringº
From the monitoring perspective, we are using different tools trying to provide
internally end-to-end monitoring, for a better troubleshooting.

For system and alerting monitoring, we are using Prometheus, an open source
solution, easy for developers and completely integrated with Kubernetes. For
infrastructure monitoring we have Icinga with hundreds of custom scripts and
graphite as timeseries database. About APM, Instana is our solution, which
makes easy and fast to monitor the applications and identify bottlenecks. We
use Runscope to test and monitor our APIs with continuous schedule to give us
completely visibility into API problems.

Grafana is the visualization tool by default which is integrated with
Prometheus, Instana, Elasticsearch, Graphite, etc.

For logging, we are based on the ELK stack running on-premises and AWS
Kubernetes, with custom components for processing and alerting being able to
notify the problems to the teams.

As the goal of monitoring is the troubleshooting, all these tools have
integration with Opsgenie, where we have defined escalation policies for
alerting and notifications. The incidents can be communicated to the final
users via Statuspage.

ºmobileº
At mobile development, we run out of hybrid solutions. Native is where we live.

We work with the latest and most concise, expressive and safe languages
available at the moment for android and iOS mobile phones: Kotlin and Swift.

Our agile mobile development process requires big doses of automation to test,
build and distribute reliably and automatically every single change in the app.
We rely on tools such as Jenkins and Fastlane to automate those steps.
100% Serverless AWS arch
What a typical 100% Serverless Architecture looks like in AWS!
https://medium.com/serverless-transformation/what-a-typical-100-serverless-architecture-looks-like-in-aws-40f252cd0ecb 
Example Data Store Arch
- Mix data store model based on:
  - Document store:
    - MongoDB        : Store and manage objects directly as documents, keeping their structure intact.
                       Avoiding Object Relational Mapping in RDMS.
    - MongoDB-GridFS : storage system supports efficient querying and storage of binary files.
                       (videos streamed).
                       Avoid the need fo blob storage structures.
  - Graph database: (Hyper-relational ddbb)
    - Neo4j: Allows for complex queries for highly linked data, stored data as nodes and links.

  - Spring Data module repositories: used to "match" Java codebase with underlying data store.
    - Allows mapping java objects to MongoDB with minimal effort.  
Facebook
netconsole monit@scale
@[http://www.serverwatch.com/server-news/linuxcon-how-facebook-monitors-hundreds-of-thousands-of-servers-with-netconsole.html]
"""""" Facebook had a system in the past for monitoring
that used syslog-ng, but it was less than 60 percent reliable.
In contrast, Owens stated netconsole is highly scalable and can
handle enormous log volume with greater than 99.99 percent
reliability.  """""""
Badoo
20 billion Events/day
@[https://www.infoq.com/news/2019/08/badoo-20-billion-events-per-day/]

Kafka

Summary
External Links
- Documentation:
@[http://kafka.apache.org/documentation/]

- Clients: (Java,C/C--,Python,Go,...)
@[https://cwiki.apache.org/confluence/display/KAFKA/Clients]

- Official JAVA JavaDoc API:
 @[http://kafka.apache.org/11/javadoc/overview-summary.html]

- Papers, ppts, ecosystem, system tools, ...:
@[https://cwiki.apache.org/confluence/display/KAFKA/Index]

- Kafka Improvement Proposals (KIP)
@[https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals]

- Data pipelines with Kotlin+Kafka+(async)Akka
@[../JAVA/kotlin_map.html?query=data+pipelines+kafka]

- Cheatsheets:
@[https://lzone.de/cheat-sheet/Kafka]
@[https://ronnieroller.com/kafka/cheat-sheet]
@[https://abdulhadoop.wordpress.com/2017/11/06/kafka-command-cheat-sheet/]
@[https://github.com/Landoop/kafka-cheat-sheet]
Global Diagram
(one+ server in 1+ datacenters)

☞Bºbest practiceº: publishers must be unaware of underlying 
   LOG partitions and only specify a partition key:
   While LOG partitions are identifiable and can be sent 
   to directly, it is not recommended. Higher level 
   constructs are recommended.

┌────────┐              ┌──────────┐               ┌───────────┐
│PRODUCER│ 1 ←······→ 1 │TOPIC     │ 1 ←···→ 0...N │SUBSCRIBERS│
│        ··→ stream of··→(CATEGORY)│               │           │
└────────┘ OºRECORDsº   └──────────┘               └───────────┘
  ↑          ======       1                          1 ← Normally, 1←→1, but not necesarelly
  |        Oº·key       º ↑                          ↑       
  ·        Oº·value     º ·                          ·     
  ·        Oº·timeStamp º ·                          ·
  ·                       ·                          ·  GºCONSUMER GROUPº: cluster of consumers
  ·                       ·                          ↓    |  (vs single process consumer) º*1º
(optionally) producers    ·                          1    ↓
can wait for ACK.         ↓                          CONSUMER   |    CONSUMER       ←───┐
  ·                       1                          GROUP A    |    GROUP B            │
  ·   ┌───┬────────────────────┬──────────────────┐  ---------  | --------------------   
  ·   │LOG│                    │  ordered records │  Cli1 Cli2  | Cli3 Cli4 Cli5 Cli6   │
  |   ├───┘                    │  ─────────────── │   |    |    |  |    |    |    |      
  └···→┌Partitionº0ºreplica ┌1 │  0 1 2 3 4 5     │  ←┤···········←┘    ·    ·    ·     │
      │└Partitionº0ºreplica │2┐│  0 1 2 3 4 5     │   ·    ·    |       ·    ·    ·      
      │┌Partition 1 replica ├1││  0 1 2 3         │   ·    ·    |       ·    ·    ·     │
      │└Partition 1 replica │2┤│  0 1 2 3         │  ·+···←┤    |       ·    ·    ·      
      │┌Partitionº2ºreplica ├1││  ...             │   ·    |    |       ·    ·    ·     │
      │└Partitionº2ºreplica │2┤│                  │   |···←┘    |       ·    ·    ·      
      │┌Partition 3 replica ├1││ - Partitions are │  ←┴················←┘    ·    ·     │
      │└Partition 3 replica │2┤│ independent and  │             |            ·    ·      
      │┌Partitionº4ºreplica ├1││ grow at differn. │             |            ·    ·     │
      │└Partitionº4ºreplica │2┤│ rates.           │             |            ·    ·      
      │┌Partition 5 replica ├1││                  │             |            ·    ·     │
      │└Partition 5 replica │2┤│ - Records expire │  ←······················←┘    ·      
      │┌Partitionº6ºreplica └1││ and can not be   │             |                 ·     │
      │└Partitionº6ºreplica  2┘│ deleted manually │  ←···························←┘      
      └──↑───────────────────^─┴──────────────────┘ * num. of group instances           │
                      │      │    ←── offset ─→       must be <= # partitions            
                      │      └────────────┐   record.offset (sequential id)             │
┌─────────────────────┴─────────────────┐ │   uniquelly indentifies the record           
─ Partitions serve several purposes:      │   within the partition.                     │
  ─ allow scaling beyond a single server. │                                              
  ─ act as the ☞BºUNIT OF PARALLELISMº.   │  - aGºCONSUMER GROUPºis a view (state,  ┐   │
─ Partition.ºRETENTION POLICYº indicates  │    position, or offset) of full LOG.    │
  how much time records will be available │  - consumer groups enable different     ├───┘
  for compsumption before being discarded.│    apps to have a different view of the │
-Bºnumber of partitions in an event hubº  │    LOG, and to read independently at    │
 Bºdirectly relates to the number of      │    their own pace and with their own    ┘
 Bºconcurrent readers expected.º          │    offsets:
-OºTOTAL ORDER  of events is just guaran-º│  ☞Bºin a stream processing Arch,º
 Oºteed inside a partition.º              │   Bºeach downstream applicationº 
   Messages sent by a producer to a given │   Bºequates to a consumer groupº
   partition are guaran. to be appended   │  
   in the order they were sent.           │  
          ┌───────────────────────────────┘           
- Messages sent by a producer to a particular topic partition are guaranteed
  to be appended in the order they are sent.
  ┌───────┴────────┐
  PARTITION REPLICA: (fault tolerance mechanism) 
  └ Kafka allows producers to wait on acknowledgement so that a write 
    isn't considered complete until it is fully replicated and guaranteed 
    to persist even if the server written to fails, allowing to balance 
    replica consistency vs performance.
  └ Each partition has one replica server acting as leader" and 0+ 
    replica servers "followers". The leader handles all read 
    and write requests for the partition while the followers passively 
    replicate the leader. If the leader fails, one of the followers 
    replaces it.
  └ A server can be a leader for partition A and a follower for 
    partition B, providing better load balancing.
  └ For a topic with replication factor N, we will tolerate up to N-1 
    server failures without losing any records committed to the log.

º*1:º
 - Log partitions are (dnyamically) divided over consumer instances so 
   that each client instance is the exclusive consumer of a "fair share"
   of partitions at any point in time.
 - The consumer group generalizes the queue and publish-subscribe:
   - As with a queue the consumer group allows you to divide (scale) up
     processing over a collection of processes (the members of the consumer group).
   - As with publish-subscribe, Kafka allows you to broadcast messages to
     multiple consumer groups


data-schema Support
Apache AVRO format
Efficient compac way to store data in disk.
@[https://shravan-kuchkula.github.io/kafka-schemas/#understand-why-data-schemas-are-a-critical-part-of-a-real-world-stream-processing-application]

Kafka Schema Registry BºDZone Intro Summaryº @[https://dzone.com/articles/kafka-avro-serialization-and-the-schema-registry] by Jean-Paul Azar Confluent Schema Registry: - REST API for producers/consumers managing Avro Schemas: - store schemas for keys and values of Kafka records. - List schemas and schema-versions by subject. - Return schema by version or ID. - get the latest version of a schema. - Check if a given schema is compatible with a certain version. - Compatibility level include: - backward: data written with old schema readable with new one. - forward : data written with new schema is readable with old one - full : backward + forward - none : Schema is stored by not schema validation is disabled (Rºnot recommendedº). - configured Bºglobally or per subjectº. - Compatibility settings can be set to support Bºevolution of schemasº. - Kafka Avro serialization project provides serializers taking schemas as input? - Producers send the schema (unique) ID and consumers fetch (and cache) the full schema from the Schema Registry. - Producer will create a new Avro record (schema ID, data). Kafka Avro Serializer will register (and cache locally) the associated schema if needed, before serializing the record. - Schema Registry ensures that producer and consumer see compatible schemas and Bºautomatically "transform" between compatible schemas,º Bºtransforming payload via Avro Schema Evolutionº. BºSCHEMA EVOLUTIONº Scenario: - Avro schema modified after data has already been written to store with old schema version. OºIMPORTANTº: ☞ From Kafka perspective,Oºschema evolutionº happens only Oºduring deserialization at the consumer (read)º: If consumer’s schema is different from the producer’s schema, and they are compatible, the value or key is automatically modified during deserialization to conform to the consumer's read schema if possible. BºSchema Evolution: Allowed compatible Modification º - change/add field's default value. - add new field with default value. - remove existing field with default value. - change field's order attribute. - add/remove field-alias (RºWARN:º can break consumers depending on the alias). - change type → union-containing-original-type. BºBest Patternº - Provide a default value for fields in your schema. - Never change a field's data type. - Do NOT rename an existing field (use aliases instead). Ex: Original schema v1: { "namespace": "com.cloudurable.phonebook", "type": "record", "name": "Employee", "doc" : "...", "fields": [ {"name": "firstName", "type": "string" }, {"name": "nickName" , "type": ["null", "string"] , "default" : null}, {"name": "lastName" , "type": "string" }, {"name": "age" , "type": "int" , "default": -1 }, {"name": "emails" , "type": {"type" : "array", "items": "string"}, "default":[] }, {"name": "phoneNum" , "type": [ "null", { "type": "record", "name": "PhoneNum", "fields": [ {"name": "areaCode" , "type": "string"}, {"name": "countryCode", "type": "string", "default" : ""}, {"name": "prefix" , "type": "string"}, {"name": "number" , "type": "string"} ] } ] }, {"name": "status" , "default" :"SALARY", , "type": { "type": "enum", "name": "Status", "symbols" : ["RETIRED", "SALARY",...] } } ] } - Schema Version 2: "age" field, def. value -1, added | KAFKA LOG | Producer@v2 →|Employee@v2| ··→ consumer@v.1 ·····→ NoSQL Store | | ^ ^ | | age field removed 'age' missing | | @deserialization | | | | | | consumer@ver.2 ←..... NoSQL Store | | ^ ^ | | age set to -1 'age' missing BºRegistry REST API Ussageº └ POST New Schema $º$ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \º $º$ --data '{"schema": "{\"type\": …}’ \ º $º$ http://localhost:8081/subjects/Employee/versions º └ List all of the schemas: $º$ curl -X GET http://localhost:8081/subjects º Using Java OkHttp client: package com.cloudurable.kafka.schema; import okhttp3.*; import java.io.IOException; public class SchemaMain { private final static MediaType SCHEMA_CONTENT = MediaType.parse("application/vnd.schemaregistry.v1+json"); private final static String EMPLOYEE_SCHEMA = "{ \"schema\": \"" ...; private final static String BASE_URL = "http://localhost:8081"; private final OkHttpClient client = new OkHttpClient(); private static newCall(Requeste request) { System.out.println(client.newCall(request).execute().body().string() ); } private static putAndDumpBody (final String URL, RequestBody BODY) { newCall(new Request.Builder().put(BODY).url(URL).build()); } private static postAndDumpBody(final String URL, RequestBody BODY) { newCall(new Request.Builder().post(BODY).url(URL).build()); } private static getAndDumpBody(final String URL) { request = new Request.Builder() .url(URL).build(); System.out.println(client.newCall(request). execute().body().string()); } public static void main(String... args) throws IOException { System.out.println(EMPLOYEE_SCHEMA); postAndDumpBody( // ← POST A NEW SCHEMA BASE_URL + "/subjects/Employee/versions", RequestBody.create( SCHEMA_CONTENT, EMPLOYEE_SCHEMA ) ); getAndDumpBody(BASE_URL + "/subjects"); // ← LIST ALL SCHEMAS getAndDumpBody(BASE_URL // ← SHOW ALL VERSIONS + "/subjects/Employee/versions/"); getAndDumpBody(BASE_URL // ← SHOW VERSION 2 OF EMPLOYEE + "/subjects/Employee/versions/2"); getAndDumpBody(BASE_URL // ← "SHOW SCHEMA WITH ID 3 + "/schemas/ids/3"); getAndDumpBody(BASE_URL // ← SHOW LATEST VERSION + "/subjects/Employee/versions/latest"); postAndDumpBody( // ← SCHEMA IS REGISTERED? BASE_URL + "/subjects/Employee", RequestBody.create( SCHEMA_CONTENT, EMPLOYEE_SCHEMA ) ); postAndDumpBody( // ← //TEST COMPATIBILITY BASE_URL + "/compatibility/subjects/Employee/versions/latest", RequestBody.create( SCHEMA_CONTENT, EMPLOYEE_SCHEMA ) ); getAndDumpBody(BASE_URL // ← TOP LEVEL CONFIG + "/config"); putAndDumpBody( // ← SET TOP LEVEL CONFIG VALs BASE_URL + "/config", // VALs :=none|backward| RequestBody.create(SCHEMA_CONTENT, forward|full "{\"compatibility\": \"none\"}" ); putAndDumpBody( // ← SET CONFIG FOR EMPLOYEE BASE_URL + "/config/Employee", // RequestBody.create(SCHEMA_CONTENT, "{\"compatibility\": \"backward\"}" ); } } BºRUNNING SCHEMA REGISTRYº $º$ CONFIG="etc/schema-registry/schema-registry.properties"º $º$ cat ${CONFIG} º $ºlisteners=http://0.0.0.0:8081 º $ºkafkastore.connection.url=localhost:2181 º $ºkafkastore.topic=_schemas º $ºdebug=false º $º$ .../bin/schema-registry-start ${CONFIG} º BºWriting Producers/Consumers with Avro Serializers/Sche.Regº └ start up the Sch.Reg. pointing to ZooKeeper(cluster). └ Configure gradle: plugins { id "com.commercehub.gradle.plugin.avro" version "0.9.0" } └─────────────┬──────────────────┘ // http://cloudurable.com/blog/avro/index.html // transform Avro type → Java class // Plugin supports: // - Avro schema files (.avsc) ("Kafka") // - Avro RPC IDL (.avdl) // $º$ gradle buildº ← generate java classesº group 'cloudurable' version '1.0-SNAPSHOT' apply plugin: 'java' sourceCompatibility = 1.8 dependencies { testCompile 'junit:junit:4.11' compile 'org.apache.kafka:kafka-clients:0.10.2.0' ← compile "org.apache.avro:avro:1.8.1" ← Avro lib compile 'io.confluent:kafka-avro-serializer:3.2.1' ← Avro Serializer compile 'com.squareup.okhttp3:okhttp:3.7.0' } repositories { jcenter() mavenCentral() maven { url "http://packages.confluent.io/maven/" } } avro { createSetters = false fieldVisibility = "PRIVATE" } └ Setup producer to use GºSchema Registryº and BºKafkaAvroSerializerº package com.cloudurable.kafka.schema; import com.cloudurable.phonebook.Employee; import com.cloudurable.phonebook.PhoneNum; import io.confluent.kafka.serializers.KafkaAvroSerializerConfig; import org.apache.kafka.clients.producer.KafkaProducer; import org.apache.kafka.clients.producer.Producer; import org.apache.kafka.clients.producer.ProducerConfig; import org.apache.kafka.clients.producer.ProducerRecord; import org.apache.kafka.common.serialization.LongSerializer; import io.confluent.kafka.serializers.KafkaAvroSerializer; import java.util.Properties; import java.util.stream.IntStream; public class AvroProducer { private static Producer˂Long, Employee˃ createProducer() { final String serClassName = LongSerializer.class.getName(); KafkaAvroClN = Serializer .class.getName(); SCHEMA_REG_URL_CONFIG = KafkaAvroSerializerConfig. SCHEMA_REGISTRY_URL_CONFIG; VAL_SERI_CLASS_CONFIG = ProducerConfig. VALUE_SERIALIZER_CLASS_CONFIG final Properties props = new Properties(); props.put(ProducerConfig. BOOTSTRAP_SERVERS_CONFIG , "localhost:9092"); props.put(ProducerConfig. CLIENT_ID_CONFIG , "AvroProducer" ); props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG , serClassName ); Bºprops.put( VAL_SERI_CLASS_CONFIG , KafkaAvroClN );º Bº// └────────────────────────┬──────────────────────────────┘ º Bº// CONFIGURE KafkaAvroSerializer. º Gºprops.put( SCHEMA_REG_URL_CONFIG , º // ← Set Schema Reg. Gº "http://localhost:8081");º URL return new KafkaProducer˂˃(props); } private final static String TOPIC = "new-employees"; public static void main(String... args) { Producer˂Long, Employee˃ producer = createProducer(); Employee bob = Employee.newBuilder().setAge(35) .setFirstName("Bob").set...().build(); IntStream.range(1, 100).forEach(index->{ producer.send(new ProducerRecord˂˃(TOPIC, 1L * index, bob)); }); producer.flush(); producer.close(); } } └ Setup consumer to use GºSchema Registryº and BºKafkaAvroSerializerº package com.cloudurable.kafka.schema; import com.cloudurable.phonebook.Employee; import io.confluent.kafka.serializers.KafkaAvroDeserializer; import io.confluent.kafka.serializers.KafkaAvroDeserializerConfig; import org.apache.kafka.clients.consumer.Consumer; import org.apache.kafka.clients.consumer.ConsumerConfig; import org.apache.kafka.clients.consumer.ConsumerRecords; import org.apache.kafka.clients.consumer.KafkaConsumer; import org.apache.kafka.common.serialization.LongDeserializer; import java.util.Collections; import java.util.Properties; import java.util.stream.IntStream; public class AvroConsumer { private final static String BOOTSTRAP_SERVERS = "localhost:9092"; private final static String TOPIC = "new-employees"; private static Consumer createConsumer() { Properties props = new Properties(); props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVERS); props.put(ConsumerConfig.GROUP_ID_CONFIG, "KafkaExampleAvroConsumer"); props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, LongDeserializer.class.getName()); //USE Kafka Avro Deserializer. props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class.getName()); //Use Specific Record or else you get Avro GenericRecord. props.put(KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG, "true"); //Schema registry location. props.put(KafkaAvroDeserializerConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081"); //<----- Run Schema Registry on 8081 return new KafkaConsumer<>(props); } public static void main(String... args) { final Consumer consumer = createConsumer(); consumer.subscribe(Collections.singletonList(TOPIC)); IntStream.range(1, 100).forEach(index -> { final ConsumerRecords records = consumer.poll(100); if (records.count() == 0) { System.out.println("None found"); } else records.forEach(record -> { Employee employeeRecord = record.value(); System.out.printf("%s %d %d %s \n", record.topic(), record.partition(), record.offset(), employeeRecord); }); }); } } Notice that just like with the producer, we have to tell the consumer where to find the Registry, and we have to configure the Kafka Avro Deserializer. Configuring Schema Registry for the consumer: //Use Kafka Avro Deserializer. props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class.getName()); //Use Specific Record or else you get Avro GenericRecord. props.put(KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG, "true"); //Schema registry location. props.put(KafkaAvroDeserializerConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081"); //<----- Run Schema Registry on 8081 An additional step is that we have to tell it to use the generated version of the Employee object. If we did not, then it would use the Avro GenericRecord instead of our generated Employee object, which is a SpecificRecord. To learn more about using GenericRecord and generating code from Avro, read the Avro Kafka tutorial as it has examples of both. https://docs.confluent.io/current/schema-registry/index.html https://docs.confluent.io/current/schema-registry/schema_registry_tutorial.html
APIs
API Summary
@[https://medium.com/@stephane.maarek/the-kafka-api-battle-producer-vs-consumer-vs-kafka-connect-vs-kafka-streams-vs-ksql-ef584274c1e]
┌──────────────────────────────────────────────────────────────────┐
│   ┌───────────────┐                           ┌──────────────┐   │
│   │Schema Registry│                           │Control Center│   │
│   └───────────────┘                           └──────────────┘   │
│                                                                  │
│                                                                  │
│     Kafka 2 ←·· Replicator ┐    ┌.. REST                         │
│                            ·    ·   Proxy                        │
│  Event                     ·    ·                                │
│  Source                  ┌─v────v──┐                             │
│ (IoT,log, ─→ Producer ───→         ───→ Consumer ──→ "real time" │
│  ...)                    │ Kafka 1 │                  action     │
│                          │         │                             │
│                          │         │                             │
│                          │         │                             │
│  Data                    │         │                 Target DDBB,│
│  Source   ─→ Connect  ───→         ───→  Connect ──→ S3,HDFS, SQL│
│ (DDBB,       Source      │         │     Sink        MongoDB,... │
│  csv,...)                └─^────^──┘                             │
│                            │    │                                │
│                            │   ┌───────────┐                     │
│                        Streams │KSQL Server│                     │
│                        API     └───────────┘                     │
└──────────────────────────────────────────────────────────────────┘
  APIs            Ussage Context
  ──────────────  ───────────────────────────────────────────────────────
  Producer        Apps directly injecting data into Kafka
  ──────────────  ───────────────────────────────────────────────────────
  Connect Source  Apps inject data into CSV,DDBB,... Conn.Src API inject 
                  such data into Kafka.
  ──────────────  ───────────────────────────────────────────────────────
  Streams/KSQL    Apps consuming from Kafka topics and injecting back
                  into Kafka:
                  - KSQL   : SQL declarative syntax
                  - Streams: "Complex logic" in programmatic java/...
  ──────────────  ───────────────────────────────────────────────────────
  Consumer        Apps consuming a stream,  and perform "real-time" action
                  on it (e.g. send email...)
  ──────────────  ───────────────────────────────────────────────────────
  Connect Sink    Read a stream and store it into a target store 
  ──────────────  ───────────────────────────────────────────────────────

Producer API:
└ Bºextremely simple to useº: send data and Wait in callback.
└ RºLot of custom code for ETL alike appsº:
    - How to track the source offsets? 
      (how to properly resume your producer in case of errors)
    - How to distribute load for your ETL across many producers?
    ( Kafka Connect Source API recommended in those cases)

Connect Source API:
└ High level API built on top of the Producer API for:
  -  producer tasks Bºdistribution for parallel processingº
  -Bºeasy mechanism to resume producersº
└Bº"Lot" of available connectorsº out of the box (zero-code).

Consumer API:
└ BºKISS APIº: It uses Consumer Groups. Topics can be consumed in parallel.
             RºCare must be put in offset management and commits, as wellº
             Rºas rebalances and idempotence constraints, they’re really º
             Rºeasy to write.                                            º
             BºPerfect for stateless workloadsº (notifications,...)
└ RºLot of custom code for ETL alike appsº:

Connect Sink API:
└  built on top of the consumer API.
└Bº"Lot" of available connectorsº out of the box (zero-code).

Streams API:
└ Support for Java and Scala.
└ It enables to write either:
  -BºHigh Level DSLº(ApacheºSpark alikeº)
  -  Low Level API  (ApacheºStorm alikeº).
└ Complicated Coding is still required, but producers/consumers handling
  completely is hidden.
Bºfocussing on stream logicº
└BºSupport for 'joins', 'aggregations', 'exactly-once' semmantics.º
└Rºunit test can be difficultº (test-utils library to the rescue).
└ Use of state stores, when backed up by Kafka topics, will force
Rºprocessing a lot more messagesº, butBºadding support for resilient appsº.


KSQL :
└ ºwrapper on top of Kafka Streamsº.
└  It abstract away Stream coding complexity.
└RºNot support for complex transformations, "exploding" arrays,...º
   (as of 2019-11, gaps can be filled)
Message Delivery Semantics
@[http://kafka.apache.org/documentation.html#semantics]
producer
@[http://kafka.apache.org/11/javadoc/index.html?org/apache/kafka/clients/producer/kafkaproducer.html]
- send streams of data to topics in the cluster.
- thread safe → sharing a singleton across threads will generally be faster. 

- ex:
     import org.apache.kafka.clients.producer.KafkaProducer;
     // pre-setup:
     final properties kafkaconf = new Properties();
     kafkaconf.put("bootstrap.servers", "localhost:9092");
     kafkaconf.put("acks"             , "all");  ← "all" : slowest/safest setting: block until
                                                   full commit of the record (in all partitions?)

     // set how to turn key|value ProducerRecord instances into bytes.
     kafkaconf.put("key.serializer"  , "org.apache.kafka.common.serialization.StringSerializer");
     kafkaconf.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
     producer producer = new kafkaproducer<>(kafkaconf);

     producer.send(                        // ← sending-topics is async: (added to buffer 
         new ProducerRecord(    of pending records)
             "my-topic",
             integer.tostring(i),
             integer.tostring(i)
         )
     );
     producer.close(); // ← leak of resources if not done
     
    producer: 
    · pool of buffer space holding records not yet transmitted for each partition
      of size (config) batch.size

  + · background i/o thread turning
      "record"  to → (batch) network request.
       ------------------------
        automatically retry unless 
        config.retries == 0
      RºWARNº: config.retries ˃ 0 opens up the possibility of duplicates .
               Kafka0.11+ idempotent mode avoid the risk
               -"enable.idempotence" setting - 
             RºWARNº: avoid application level re-sends in this case.
             See more details in official doc.

    tunning º"linger.ms"º tells producer to wait up to that number of milliseconds
    before sending a request in hope that more records will arrive to fill up the
    same batch. This can increase performance by introducing a determenistic delay
    of at-least "linger.ms".
  Bº(similar to nagle's algorithm in tcp)º.


  - Kafka 0.11+ transactional producer allows clients to send messages to multiple
    partitions and topics! atomically (producer.beginTransaction, producer.send, 
    producer.commit, producer.abortTransaction ).

config see full list @ @[http://kafka.apache.org/documentation/#producerconfigs] -ºcleanup.policyº : "delete"* or "compact" retention policy for old log segments. -ºcompression.typeº : 'gzip', 'snappy', lz4, 'uncompressed' -ºindex.interval.bytesº: how frequently kafka adds an index entry to it's offset index. -ºleader.replication.throttled.replicasº: ex: [partitionid]º:[brokerid],[partitionid]:[brokerid]:... or wildcard '*' can be used to throttle all replicas for this topic. -ºmax.message.bytesº : largest record batch size allowed -ºmessage.format.versionº: valid apiversion ( 0.8.2, 0.9.0.0, 0.10.0, ...) -ºmessage.timestamp º : max dif allowed between the timestamp when broker receives º.difference.max.msº a message and timestamp specified in message. -ºmessage.timestamp.typeº: "createtime"|"logappendtime" -ºmin.insync.replicasº : all|-1|"n": number of replicas that must acknowledge a write for it to be considered successful. -ºretention.msº : time we will retain a log before old log segments are discarded. bºsla on how soon consumers must read their dataº.
Consumer
@[http://kafka.apache.org/26/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html]
@[http://kafka.apache.org/11/javadoc/index.html?org/apache/kafka/streams/KafkaStreams.html]

Class KafkaConsumer˂K,V˃

- client consuming records from cluster.
- transparently handles failure of Kafka brokers,
- transparently adapts as topic partitions it fetches migrate within the cluster.
- This client also interacts with the broker to 
 ºallow consumer-groups to load balance consumption º.
- consumer Rºis not thread-safeº.
- Compatible with brokers ver. 0.10.0+.

Basics:
- Consumer  Position: offset of the next record that will be given out.
- Committed position: last offset stored securely. Should the process
                      fail and restart, this is the offset consumers will
                      recover to.
  - Consumer can either automatically commit offsets periodically; or
    commit it manually.(commitSync, commitAsync).

- Consumer Groups and Topic Subscriptions:
  - consumer groups: consumer instances sharing same group.id creating a 
                     pool of processes (same or remote machines) to split
                     the processing of records.
  - Each consumer in a group can dynamically set the list of topics it 
    wants to subscribe to through one of the subscribe APIs. 
  - Kafka will deliver each message in the subscribed topics to one process
    in each consumer group by balancing partitions between all members so that
  Bºeach partition is assigned to exactly one consumer in the groupº. 
    (It makes no sense to have more consumers that partititions)
    Group rebalancing (map from partitions to consumer in group) occurs when:
    - Consumer is added/removed/not-available*1 to pool.
      *1 liveness detection time defined in "max.poll.interval.ms".
    - partition is added/removed from cluster.
    - new topic matching a subscribed regex is created.
    (Consumer groups just allow to have independent parallel multiprocess
     clients acting as a same app with no need to manual synchronization).
     (additional consumers are actually quite cheap).

    (See official doc for advanced topics on balancing consumer groups)


  - Simple Example Consumer: let Kafka dynamically assign a fair share of
                             the partitions for subscribed-to topics
    (See original doc for manual choosing partitions to consume from)
    Properties configProps = new Properties();
    configProps.setProperty("bootstrap.servers", "localhost:9092");
    configProps.setProperty("group.id", "test");
    configProps.setProperty("enable.auto.commit", "true");
    configProps.setProperty("auto.commit.interval.ms", "1000");
    configProps.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer˂String, String˃ consumer = new KafkaConsumer˂˃(configProps);
    consumer.subscribe(Arrays.asList("foo", "bar"));
    while (true) {
        ConsumerRecords˂String, String˃ records = consumer.poll(Duration.ofMillis(100));
        for (ConsumerRecord˂String, String˃ record : records)
            System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
           if (isManualCommit /*enable.auto.commit == false*/) {
               ... ;  consumer.commitSync();  ...
           }
    

    Use consumer.seek(TopicPartition, long) to skip up to some record.
    Use seekToBeginning(Collection)/seekToEnd(Collection) to go to start/end of partition. 


    See original doc for how to read Transactional Messages.
    See original doc for Multi-threaded Processing.

Config @[http://kafka.apache.org/documentation/#consumerconfigs] @[http://kafka.apache.org/documentation/#newconsumerconfigs] @[http://kafka.apache.org/documentation/#oldconsumerconfigs]
Connect
@[http://kafka.apache.org/11/javadoc/index.html?overview-summary.html]
(more info)
- allows reusable producers or consumers that connect Kafka
  topics to existing applications or data systems.

Connectors List @[https://docs.confluent.io/current/connect/connectors.html] @[https://docs.confluent.io/current/connect/managing/connectors.html] - Kafka connectors@github @[https://github.com/search?q=kafka+connector]
-BºHTTP Sinkº -BºFileStreamºs (Development and Testing) -BºGitHub Sourceº -BºJDBC (Source and Sink)º - PostgresSQL Source (Debezium) - SQL Server Source (Debezium) -BºSyslog Sourceº - AWS|Azure|GCD|Salesforce "*" - ...
Config @[http://kafka.apache.org/documentation/#connectconfigs]
AdminClient
@[http://kafka.apache.org/11/javadoc/index.html?org/apache/kafka/clients/admin/AdminClient.html]

- administrative client for Kafka, which supports managing and inspecting
  topics, brokers, configurations and ACLs.

Config @[http://kafka.apache.org/documentation/#adminclientconfigs]
Streams
@[http://kafka.apache.org/25/documentation/streams/]
See also:
@[http://kafka.apache.org/11/javadoc/index.html?org/apache/kafka/streams/KafkaStreams.html]
- Built on top o producer/consumer API.

- simple lightweight embedableBºclient libraryº, with support for
  real-time querying of app state with low level Processor API
  primitives plus high-level DSL.

- transparent load balancing of multiple instances of an application.
  using Kafka partitioning model to horizontally scale processing
  while maintaining strong ordering guarantees.

- Supports fault-tolerant local state, which enables very fast and 
 efficient stateful operations likeºwindowed joins and aggregationsº.

- Supports exactly-once processing semantics when there is a 
  client|Kafka failure.

- Employs one-record-at-a-time processing to achieve millisecond 
  processing latency, and supports event-time based windowing 
  operations with out-of-order arrival of records.

  import org.apache.kafka.common.serialization.Serdes;
  import org.apache.kafka.common.utils.Bytes;
  import org.apache.kafka.streams.KafkaStreams;
  import org.apache.kafka.streams.StreamsBuilder;
  import org.apache.kafka.streams.StreamsConfig;
  import org.apache.kafka.streams.kstream.KStream;
  import org.apache.kafka.streams.kstream.KTable;
  import org.apache.kafka.streams.kstream.Materialized;
  import org.apache.kafka.streams.kstream.Produced;
  import org.apache.kafka.streams.state.KeyValueStore;
   
  import java.util.Arrays;
  import java.util.Properties;
   
  public class WordCountApplication {
   
    public static void main(final String[] args) throws Exception {
      final Properties props = new Properties();
      props.put(StreamsConfig.APPLICATION_ID_CONFIG    ,
                "wordcount-app");
      props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG ,
                "kafka-broker1:9092");
      props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, 
                Serdes.String().getClass());
      props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG,
                Serdes.String().getClass());
  
      final StreamsBuilder builder = new StreamsBuilder();
      KStream˂String, String˃ txtLineStream 
        = builder.stream("TextLinesTopic");
      KTable˂String, Long˃ wordCounts = txtLineStream
          .flatMapValues(
                          textLine -˃
                          Arrays.asList(
                             textLine.toLowerCase().split("\\W+")
                          )
                        )
          .groupBy      ( (key, word) -˃ word )
          .count        ( Materialized.
                          ˂  String, Long, KeyValueStore˂Bytes, byte[]˃ ˃
                          as("counts-store"));
      wordCounts.
        toStream().
        to("WordsWithCountsTopic", 
           Produced.with(Serdes.String(), Serdes.Long()));
  
      KafkaStreams streams = new KafkaStreams(builder.build(), props);
      streams.start();
    }
  }

BºKafka Streams CORE CONCEPTSº
@[http://kafka.apache.org/25/documentation/streams/core-concepts]

  └ºStreamº   : graph of stream processors (nodes) that
   ºProcessorº  connected by streams (edges).
   ºtopologyº   

  └ºStreamº   : It represents an data set unbounded in size / time, 
                ordered, replayable and fault-tolerant inmmutable 
                record set.
 
  └ºStreamº   : node processing an step to transform data
   ºprocessorº  in streams.
                - It receives one input record at a time from
                  upstream processors and produces 1+ output records.

                - Special processors:
                  - Source Processor: NO upstream processors,
                    just one or multiple Kafka topics as input.

                  - Sink Processor: no down-stream processors. 
                    Outpus goes to Kafka topic/external system.

  └two ways to define the stream processing topology:
   - Streams DSL  : map, filter, join and aggregations
   - Processor API: (low-level) Custom processors. It also allows
                    to interact with state stores.

  └ºTime modelº: operations like windowing are defined based 
                 on time boundaries.  Common notions of time in
                streams are:
    - Event      time: 
    - Ingestion  time: time it's stored into a topic partition.
    - Processing time: It may be (milli)seconds for real time or
                       hours for batch time, after event time.
                       real event time.

    - Kafka 0.10.x+ automatically embeds (event or ingestion) time.
      Event of ingestion choose can be done at Kafka or topic level 
      in configuration.
    - Kafka Streams assigns a TS to every data record via the 
      TimestampExtractor interface allowing to describe the "progress"
      of a stream with regards to time and are leveraged by 
      time-dependent operations such as window operations. 

    - time only "advances" when a new record arrives at the 
      processor.  Concrete implementations of the TimestampExtractor
      interface will provide different semantics to the
      stream time definition.

    - Finally, Kafka Streams sinks will also assign timestamps
      in a way that depends on the context:
      - When output is generated from some input record,
        for example, context.forward(), TS is  inherited from input.
      - When new output record is generated via periodic
        functions such as Punctuator#punctuate(), TS is defined
        as current node internal time (context.timestamp()) .
      - For aggregations, result update record TS is the max.
        TS of all input records.
      NOTE: default behavior can be changed in the Processor API
            by assigning timestamps to output records explicitly 
            in "#forward()".

 
  └ºAggregationº: 
   ºOperation  º  
          INPUT                      OUTPUT
          --------                   ------
          KStream   → Aggregation →  KTable
       or KTable         
          ^             ^              ^
          DSL         Ex: count/sum  DSL object:
                                     - new value is considered
                                       to overwrite the old value
                                      ºwith the same keyºin next
                                       steps.

  └ºWindowingº:  - trackedºper record keyº.
                 - Available in ºStreams DSLº. 
                 - window.ºgrace_periodº controls 
                  ºhow long Streams clients will wait forº
                  ºout-of-order data records.º
                 - Records arriving "later" are discarded.
                   "late" == record.timestamp dictates it belongs 
                             to a window, but current stream time
                             is greater than the end of the window 
                             plus the grace period.

  └ºStates   º:  - Needed by some streams. 
                 -Bºstate storesº in Stream APIs allows apps to
                   store and query data, needed by stateful operations.
                 - Every task in Kafka Streams embeds 1+ state stores
                   that can be accessed via APIs to store and query
                   data required for processing. They can be:
                   -ºpersistent key-value storeº:
                   -ºIn-memory hashmapº
                   - "another convenient data structure".
                 - Kafka Streams offers fault-tolerance and automatic
                   recovery for local state-stores.
                 - directºread-only queriesºof the state stores
                   is provided to methods, threads, processes or
                   applications external to the stream processing
                   app through BºInteractive Queriesº, exposing the
                   underlying implementation of state-store read 
                   methods.

  └ºProcessingº: - at-least-once delivery 
   ºGuaranteesº    (processing.guarantee=exactly_once  in config)
                 - exactly-once processing semantics (Kafka 0.11+)
                                                      ^^^^^^^^^^
                Kafka 0.11.0+ allows producers to send messages to
                different topic partitions in transactional and
                idempotent manner.
                More specifically,Streams client APIguarantees that 
                for any record read from the source Kafka topics,
                its processing results will be reflected exactly once
                in the output Kafka topic as well as in the 
                state stores for stateful operations.
                (KIP-129 lecture recomended)


  └ºOut-of-Orderº:
   ºHandlingº
      - Within topic-partition:
        - records with larger timestamps but smaller offsets
          are processed earlier.
      - Within stream task processing "N" topic-partitions:
        - If app is Rºnot configured to wait for all partitionsº
        Rºto contain some buffered dataº and pick from the
          partition with the smallest timestamp to process
          the next record, timestamps may be smaller in 
          following records for different partitions.
          "FIX": Allows applications to wait for longer time
                  while bookkeeping their states during the wait time.
                  i.e. making trade-off decisions between latency,
                 cost, and correctness.
                 In particular, increase windows grace time.
           Rº As for Joins some "out-of-order" data cannot be handled
              by increasing on latency and cost in Streams yet:


Config See details: @[http://kafka.apache.org/documentation/#streamsconfigs] -ºCore config:º -ºapplication.id º: string unique within the Kafka cluster -ºbootstrap.servers º: host1:port1,host2:port2 ^^^^^^^^^^^^^^^^^^^^^^^ No need to add all host. Just a few ones to start sync -ºreplication.factorº: int, Default:1 -ºstate.dir º: string, Default: /tmp/kafka-streams -ºOther params:º - cache.max.bytes.buffering: long, def: 10485760 (max bytes for buffering ºacross all threadsº) - client.id : ID prefix string used for the client IDs of internal consumer, producer and restore-consumer, with pattern '-StreamThread--'. Default: "" - default.deserialization.exception.handler - default.key .serde : Default serializer / deserializer class - default.value.serde - default.production.exception.handler: Exception handling class - default.timestamp.extractor - max.task.idle.ms : long, Maximum amount of time a stream task will stay idle when not all of its partition buffers contain records, to avoid potential out-of-order record processing across multiple input streams. - num.standby.replicas: int (default to 0) - num.stream.threads : - processing.guarantee: at_least_once (default) | exactly_once. ^^^^^^^^^^^^ - It requires 3+ brokers in production - for development it can be changed by tunning broker setting - transaction.state.log.replication.factor - transaction.state.log.min.isr. - security.protocol : PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL. - topology.optimization: [none*, all] Set wheher Kafka Streams should optimize the topology - application.server : Default: "", endpoint used for state store discovery and interactive queries on this KafkaStreams instance. - buffered.records.per.partition: int, default to 1000 - built.in.metrics.version - commit.interval.ms : frequency to save the position of the processor. (default to 100 for exactly_once or 30000 otherwise). - connections.max.idle.ms: Default: 540000 - metadata.max.age.ms: - metric.reporters : - metrics.num.samples: - metrics.recording.level: [INFO, DEBUG] - metrics.sample.window.ms - poll.ms : Amount of time in milliseconds to block waiting for input. - receive.buffer.bytes: int, def: 32768, size of the TCP receive buffer (SO_RCVBUF) to use when reading data. Set to -1 to use OS default. - send.buffer.bytes : in, def: 131072, size of the TCP send buffer (SO_SNDBUF ) to use when sending data. Set to -1 to use OS default. - reconnect.backoff.max.ms: - reconnect.backoff.ms - request.timeout.ms - retries : Setting a value greater than zero will cause the client to resend any request that fails with a potentially transient error. - retry.backoff.ms : - rocksdb.config.setter: - state.cleanup.delay.ms: - upgrade.from : - windowstore.changelog.additional.retention.ms:
Examples @[https://github.com/confluentinc/kafka-streams-examples] - Examples: Runnable Applications - Examples: Unit Tests - Examples: Integration Tests - Docker Example: Kafka Music demo application Example ºdocker-compose.ymlº: └───────┬────────┘ Services launched: zookeeper: kafka: schema-registry: kafka-create-topics: ... kafka-topics --create --topic play-events ... ... kafka-topics --create --topic song-feed ... kafka-music-data-generator: ← producer kafka-music-application: ← consumer - Examples: Event Streaming Platform
KSQL(ksqlDB)
@[https://www.confluent.io/]
@[https://www.confluent.io/blog/ksql-open-source-streaming-sql-for-apache-kafka/]

See also:
- Pull Queries and Connector Management Added to ksqlDB (KSQL) Event Streaming Database for Kafka (2019-12)
@[https://www.infoq.com/news/2019/12/ksql-ksqldb-streaming-database/]
  - pull queries to allow for data to be read at a specific point in 
    time using a SQL syntax, and Connector management that enables direct 
    control and execution of connectors built to work with Kafka Connect. 
  - Until now KSQL has only been able to query continuous streams 
    ("push queries"). Now it can also read current state of a materialized 
    view using pull queries:
    These new queries can run with predictable low latency since the
    materialized views are updated incrementally as soon as new
    messages arrive.
  - With the new connector management and its built-in support for a 
    range of connectors, it’s now possible to directly control and 
    execute these connectors with ksqlDB, instead of creating separate 
    solutions using Kafka, Connect, and KSQL to connect to external data 
    sources. 
    The motive for this feature is that the development team 
    believes building applications around event streams Rºwas too complexº. 
  BºInstead, they want to achieve the same simplicity as when buildingº
  Bºapplications using relational databases.                          º

  - Internally, ksqlDB architecture is based on a distributed commit 
    log used to synchronize the data across nodes. To manage state, 
    RocksDB is used to provide a local queryable storage on disk. A 
    commit log is used to update the state in sequences and to provide 
    failover across instances for high availability.
Developer Tools
Docker Img For Developers
@[https://github.com/lensesio/fast-data-dev]
- Apache Kafka docker image for developers; with Lenses (lensesio/box) 
  or Lenses.io's open source UI tools (lensesio/fast-data-dev). Have a 
  full fledged Kafka installation up and running in seconds and top it 
  off with a modern streaming platform (only for kafka-lenses-dev), 
  intuitive UIs and extra goodies. Also includes Kafka Connect, Schema 
  Registry, Lenses.io's Stream Reactor 25+ Connectors and more.

  Ex SystemD integration file:
  /etc/systemd/system/kafkaDev.service
  | #!/bin/bash
  | 
  | # Visit http://localhost:3030 to get into the fast-data-dev environment
  | 
  | [Unit]
  | Description=Kafka Lensesio
  | After=docker.service
  | Wants=network-online.target docker.socket
  | Requires=docker.socket
  | 
  | [Service]
  | Restart=always
  | # Create container if it doesn't exists with container inspect
  | ExecStartPre=/bin/bash -c "/usr/bin/docker container inspect lensesio/fast-data-dev 2> /dev/null || /usr/bin/docker run -d --name kafkaDev --net=host -v /var/backups/DOCKER_VOLUMES_HOME/kafkaDev:/data lensesio/fast-data-dev" 
  | ExecStart=/usr/bin/docker start -a    kafkaDev 
  | ExecStop=/usr/bin/docker   stop -t 20 kafkaDev 
  | 
  | [Install]
  | WantedBy=multi-user.target
DevOps
Ansible Install
@[https://github.com/confluentinc/cp-ansible]
@[https://docs.confluent.io/current/installation/cp-ansible/index.html]
- Installs Confluent Platform packages.
  - ZooKeeper
  - Kafka
  - Schema Registry
  - REST Proxy
  - Confluent Control Center
  - Kafka Connect (distributed mode)
  - KSQL Server

- systemd Integration

- Configuration options for:
   plaintext, SSL, SASL_SSL, and Kerberos.



pom.xml

ºProducer, consumer, AdminClient APIsº   ºStreams APIº
˂dependency˃                             ˂dependency˃
  ˂groupId˃org.apache.kafka˂/groupId˃      ˂groupId˃org.apache.kafka˂/groupId˃
  ˂artifactId˃kafka-clients˂/artifactId˃   ˂artifactId˃kafka-streams˂/artifactId˃
  ˂version˃1.1.0˂/version˃                 ˂version˃1.1.0˂/version˃
˂/dependency˃                            ˂/dependency˃
Quick start
@[http://kafka.apache.org/documentation.html#quickstart]
BºPRE-SETUPº
  Download tarball from mirror @[https://kafka.apache.org/downloads],
  then untar like:
  $º$ tar -xzf - kafka_2.13-2.5.0.tgz ; cd kafka_2.13-2.5.0º

  - Start the server
  $º$ ~ bin/zookeeper-server-start.sh \                    º ← Start Zookeeper
  $º  config/zookeeper.properties 1˃zookeeper.log 2˃⅋1 ⅋   º

  $º$ ~ bin/kafka-server-start.sh  \                       º ← Start Kafka Server
  $º  config/server.properties 1˃kafka.log  2˃⅋1 ⅋ \       º

  $º$ bin/kafka-topics.sh --create --zookeeper \           º ← Create BºTESTº topic
  $º  localhost:2181  --replication-factor 1 \             º (Alt.brokers can be
  $º  --partitions 1 --topic TEST                          º  configured to  auto-create 
                              └──┘                            them when publishing to
                                                              non-existent ones)

  $º$ bin/kafka-topics.sh --list --zookeeper localhost:2181º ← Check topic
  $º$ TEST                                                 º ← Expected output           
      └──┘

  $º$ bin/kafka-console-producer.sh --broker-list          º ← Send some messages
  $º  localhost:9092 --topic TEST                          º
  $ºThis is a message                                      º
  $ºThis is another message                                º
  $ºCtrl+V                                                 º

  $º$ bin/kafka-console-consumer.sh --bootstrap-server \   º ← Start a consumer
  $ºlocalhost:9092 --topic Bºtestº --from-beginning        º
  $ºThis is a message                                      º
  $ºThis is another message                                º


  ********************************
  * CREATING A 3 BROKERS CLUSTER *
  ********************************
  $º$ cp config/server.properties config/server-1.propertiesº ← Add 2 new broker configs.
  $º$ cp config/server.properties config/server-2.propertiesº
                                  └───────────┬────────────┘ 
               ┌──────────────────────────────┤
   ┌───────────┴────────────┐      ┌──────────┴─────────────┐  
  ºconfig/server-1.properties:º   ºconfig/server-2.properties:º
   broker.id=1                     broker.id=2                 ← unique id 
   listeners=PLAINTEXT://:9093     listeners=PLAINTEXT://:9094
   log.dir=/tmp/kafka-logs-1       log.dir=/tmp/kafka-logs-2   ← avoid overwrite


  $º$ bin/kafka-server-start.sh config/server-1.properties ...º ← Start 2nd cluser
  $º$ bin/kafka-server-start.sh config/server-2.properties ...º ← Start 3rd cluser

  $º$ bin/kafka-topics.sh --create --zookeeper localhost:2181\º ← Create new topic with
  $º  --replication-factor 3 --partitions 1                   º Bºreplication factor of 3º
  $º  --topic  topic02                                        º

  $º$ bin/kafka-topics.sh --describe \                        º ← Check know which broker
  $º  --zookeeper localhost:2181 --topic topic02              º   is doing what
   (output will be similar to)
   Topic: topic02  PartitionCount:1  ReplicationFactor:3 Configs:         ← summary 
       Topic: topic02 Partition:0 Leader:1  Replicas: 1,2,0 Isr: 1,2,0 ← Partition 0
                                                            └┬┘
                                            set of "in-sync" replicas. 
                                           (subset of "replicas" currently
                                            alive and in sync with "leader")


  *****************************************
  * Using Kafka Connect to import/export  *
  * data using simple connectors          *
  *****************************************

BºPRE-SETUPº
  $º$ echo -e "foo\nbar" > test.txt       º ← Prepare (input)test data

  - SETUP source/sink Connectors:
    config/connect-file-srcs.properties ← unique_connector_id, connector class,
    config/connect-file-sink.properties   ...

  $º$ bin/connect-standalone.sh \            º ← Start Bºtwo connectorsº running in 
  $º  \                                      º   standalone mode (dedicated process)
  $º  config/connect-standalone.properties \ º ← 1st param is common Kafka-Connect config 
  $º  \                                      º   (brokers, serialization format ,...)
  $º  config/connect-file-srcs.properties  \ º 
  $º  config/connect-file-sink.properties    º 
    └─────────────────┬────────────────────┘
    examples in kafka distribution set the "pipeline" like:
    "test.txt" → connect-test  → sink connector → "test.sink.txt"


Configuration
Broker Config
Broker config: @[http://kafka.apache.org/documentation/#configuration]
  Main params:
broker.id
log.dirs
zookeeper.connect
Topics Conf
Topics config: @[http://kafka.apache.org/documentation/#topicconfigs"
Compaction
http://kafka.apache.org/documentation.html#compaction
Kafka+Kubernetes
Strimzi (Images+Operators)
@[https://developers.redhat.com/blog/2019/06/06/accessing-apache-kafka-in-strimzi-part-1-introduction/]
@[https://developers.redhat.com/blog/2019/06/07/accessing-apache-kafka-in-strimzi-part-2-node-ports/]
@[https://developers.redhat.com/blog/2019/06/10/accessing-apache-kafka-in-strimzi-part-3-red-hat-openshift-routes/]
@[https://developers.redhat.com/blog/2019/06/11/accessing-apache-kafka-in-strimzi-part-4-load-balancers/]
@[https://developers.redhat.com/blog/2019/06/12/accessing-apache-kafka-in-strimzi-part-5-ingress/]

- Strimzi: open source project that provides container images and 
Bºoperatorsº for running Apache Kafka on Kubernetes and Red Hat 
  OpenShift. Scalability is one of the flagship features of Apache 
  Kafka. It is achieved by partitioning the data and distributing them 
  across multiple brokers. Such data sharding has also a big impact on 
  how Kafka clients connect to the brokers. This is especially visible 
  when Kafka is running within a platform like Kubernetes but is 
  accessed from outside of that platform.

Unordered
Kafka Design
For details about the Kafka's commit log storage and replication design:
https://kafka.apache.org/documentation/#design
Security
  Security Docs
Burrow Monit
@[https://dzone.com/articles/kafka-monitoring-with-burrow]
Best Pracites
@[https://www.infoq.com/articles/apache-kafka-best-practices-to-optimize-your-deployment]
Mirror Maker (geo-replica)
@[https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330]
fka's mirroring feature makes it possible to maintain a replica of an 
existing Kafka cluster. The following diagram shows how to use the 
MirrorMaker tool to mirror a source Kafka cluster into a target 
(mirror) Kafka cluster. The tool uses a Kafka consumer to consume 
messages from the source cluster, and re-publishes those messages to 
the local (target) cluster using an embedded Kafka producer.
Akka+Kotlin:Data Pipes
@[https://www.kotlindevelopment.com/data-pipelines-kotlin-akka-kafka/]
- Objective: 
  Create a "GitHub monitor" that build an analytics component that
  polls one of your services (GitHub), writes data into a message queue
  for later analysis, then, after post-processing, updates statistics in
  a SQL database.

- Tooling: Akka Streams + Alpakka connector collection

- "Architecture":
  1) polls GitHub Event API for kotlin activity
  2) writes all events into a Kafka topic for later use
  3) reads events from Kafka and filters out PushEvents
  4) updates a Postgres database with:
     - who pushed changes
     - when
     - into which repository.

- Akka summary: data is moving from Sources to Sinks.
  (Observable and Sink in RxJava)

ºENTRY POINTº
(standard main function)
  fun main(vararg args: String) {
    val system = ActorSystem.create()                          // "boilerplate" for using Akka and Akka Streams
    val materializer = ActorMaterializer.create(system)
    val gitHubClient = GitHubClient(system, materializer)      // instance used to poll the GitHub events API
    val eventsProducer = EventsProducer(system, materializer)  // instance used to write events into Kafka
    val eventsConsumer = EventsConsumer(system)                // instance used to read events from Kafka
    val pushEventProcessor = PushEventProcessor(materializer)  // instance used to filter PushEvents and update the database
                                                               
    eventsProducer.write(gitHubClient.events())                // put things in motion.
    pushEventProcessor.run(eventsConsumer.read())
  }

  Each time we receive a response from GitHub, we parse it and send individual events downstream.
  fun events(): Source˂JsonNode, NotUsed˃ =
    poll().flatMapConcat { response -˃
      response.nodesOpt
        .map { nodes -˃ Source.from(nodes) }
        .orElse(Source.empty())
    }

ºEventsProducer and EventsConsumerº
(the "power" of Akka Streams and Alpakka)

- Akka-Streams-Kafka greatly reduces the amount of code
  that we have to write for integrating with Kafka.
  Publishing events into a Kafka topic look like:

  fun write(events: Source˂JsonNode, NotUsed˃)
      : CompletionStage˂Done˃
        =  events.map {                                        // ← maps GitHub event to(Kafka)ProducerRecord 
             node -˃ ProducerRecord˂ByteArray, String˃
                        ("kotlin-events",
                         objectMapper.writeValueAsString(node) // ← serialize JsonNode as a String
                        ) 
      }.runWith(                        // ← connects Source Sink  
          Producer.plainSink(settings), materializer)                                         
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        - defined@Akka Streams Kafka
        - takes care of communicating with Kafka

- The other way around, reading from Kafka, is also super simple.

  fun read(): Source˂JsonNode, NotUsed˃ =
    Consumer.plainSource(
        settings,
        Subscriptions.assignmentWithOffset(
            TopicPartition("kotlin-events", 0), 0L))
       .map { record -˃ objectMapper.readTree(record.value()) }
       .mapMaterializedValue { NotUsed.getInstance() }

  At this point, we have a copy of GitHub's events feed for github.com/Kotlin stored in Kafka,
so we can time travel and run different analytics jobs on our local dataset.

ºPushEventProcessorº
 we want to filter out PushEvents from the stream
 and update a Postgres database with the results.
 - Alpakka Slick (JDBC) Connector is used to connect to PostgreSQL.
   fun createTableIfNotExists(): Source˂Int, NotUsed˃ {
     val ddl =
       """
         |CREATE TABLE IF NOT EXISTS kotlin_push_events(
         |  id         BIGINT    NOT NULL,
         |  name       VARCHAR   NOT NULL,
         |  timestamp  TIMESTAMP NOT NULL,
         |  repository VARCHAR   NOT NULL,
         |  branch     VARCHAR   NOT NULL,
         |  commits    INTEGER   NOT NULL
         |);
         |CREATE UNIQUE INDEX IF NOT EXISTS id_index ON kotlin_push_events (id);
       """.trimMargin()
     return Slick.source(session, ddl, { _ -˃ 0 })
   }

 - Similarly, the function to update the database looks like this.

   fun Source˂PushEvent, NotUsed˃.updateDatabase() :
       CompletionStage˂Done˃ =
           createTableIfNotExists().flatMapConcat { this }
       .runWith(Slick.sink˂PushEvent˃(session, 20, { event -˃
         """
           |INSERT INTO kotlin_push_events
           |(id, name, timestamp, repository, branch, commits)
           |VALUES (
           |  ${event.id},
           |  '${event.actor.login}',
           |  '${Timestamp.valueOf(event.created_at)}',
           |  '${event.repo.name}',
           |  '${event.payload.ref}',
           |  ${event.payload.distinct_size}
           |)
           |ON CONFLICT DO NOTHING
         """.trimMargin()
       }), materializer)
 
   We are almost done, what's left is filtering and mapping from JsonNode to
   PushEvent and composing the methods together.

   fun Source˂JsonNode, NotUsed˃.filterPushEvents(): Source˂PushEvent, NotUsed˃ =
     filter { node -˃ node["type"].asText() == "PushEvent" }
       .map { node -˃ objectMapper.convertValue(node, PushEvent::class.java) }

   And finally, all the functions composed together look like this.
   fun run(events: Source˂JsonNode, NotUsed˃): CompletionStage˂Done˃ =
     events
       .filterPushEvents()
       .updateDatabase()

  This is why we've used the extension methods above, so we can describe 
  transformations like this, simply chained together. 
  That's it, after running the app for a while (gradle app:run) we can see 
  the activities around different Kotlin repositories. 


You can find the complete source code on GitHub.

    A very nice property of using Akka Streams and Alpakka is that it makes 
  really easy to migrate/reuse your code, e.g. in case you want to store data 
  in Cassandra later on instead of Postgres. All you would have to do is define 
  a different Sink with CassandraSink.create. Or if GitHub events would be 
  dumped in a file located in AWS S3 instead of published to Kafka, all you 
  would have to do is create a Source with S3Client.download(bucket, key). The 
  current list of available connectors is located here, and the list is growing.
Pixy
@[https://github.com/mailgun/kafka-pixy]
Kafka-Pixy is a dual API (gRPC and REST) proxy for Kafka with 
automatic consumer group control. It is designed to hide the 
complexity of the Kafka client protocol and provide a stupid simple 
API that is trivial to implement in any language.
Pulsar vs Kafka
- Pulsar is a younger project Bºinspired and informed by Kafkaº. 
  Kafka on the other side has a bigger community. 

BºPulsas "PRO"s over Kafkaº
Bº========================º
@[https://kafkaesque.io/7-reasons-we-choose-apache-pulsar-over-apache-kafka/]
@[https://kesque.com/5-more-reasons-to-choose-apache-pulsar-over-kafka/]

1. Streaming and queuing Come together:
   Pulsar supports standard message queuing patterns, such as
   competing consumers, fail-over subscriptions, and easy message 
   fan out keeping track of the client read position in the topic 
   and stores that information in its high-performance distributed 
   ledger, Apache BookKeeper, handling many of the use cases of a 
   traditional queuing system, like RabbitMQ.
 
2. Simpler ussage:
   If you don't need partition you don't have to worry about them.
   "If you just need a topic, then use a topic". Do not worry about
   how many consumers the topic might have. 
   Pulsar subscriptions allow you to add as many consumers as you want 
   on a topic with Pulsar keeping track of it all. If your consuming 
   application can’t keep up, you just use a shared subscription to 
   distribute the load between multiple consumers.
   Pulsar hasºpartitioned topicsºif you need them, but 
  ºonly if you need themº.

3. Fitting a log on a single server becomes a challenge (Disk full,
   remote copy o large logs can take a long time.
   More info at "Adding a New Broker Results in Terrible Performance"
 @[https://www.confluent.io/blog/stories-front-lessons-learned-supporting-apache-kafka/]
   Apache Pulsar breaks logs into segments and distributes them across
   multiple servers while the data is being written by using BookKeeper
   as its storage layer. 
   This means that the Bºlog is never stored on a single serverº, so a
   single server is never a bottleneck. Failure scenarios are easier to
   deal with and Bºscaling out is a snap: Just add another server.º
 BºNo rebalancing needed.º

4. Stateless Brokers:
   In Kafka each broker contains the complete log for each of
   its partitions. If load gets too high, you can't simply add
   another broker. Brokers must synchronize state from other 
   brokers that contain replicas of its partitions.
   In Pulsar brokers accept data from producers and send data
   to consumers, but the data is stored in Apache BookKeeper.
   If load gets high, just add another broker.
   It starts up quickly and gets to work right away.

5. Geo-replication is a first-class feature in Pulsar.
   (vs proprietary add-on).
 BºConfiguring it is easy and it just works. No PhD needed.º
   
6. Consistently Faster
 BºPulsar delivers higher throughput along with lower and moreº
 Bºconsistent latency.º

7. All Apache Open Source
   input and output connectors (Pulsar IO), SQL-based topic queries 
   (Pulsar SQL), schema registry,...
   (vs Kafka open-source features controlled by a commercial entity)

8. Pulsar can have multiple tenants and those tenants can have 
   multiple namespaces to keep things all organized. Add to that
   access controls, quotas, and rate-limiting for each namespace 
   and you can imagine a future where we can all get along using
   just this one cluster.
   (WiP or Kafka in KIP-37).

9. Replication
   You want to make sure your messages never get lost. In Kakfa
   you configure 2 or 3 replicas of each message in case 
   something goes wrong.
   In Kafka the leader stores the message and the followers make
   a copy of it. Once enough followers acknowledge they’ve got it,
   "Kafka is happy".
   Pulsar uses a quorum model: It sends the message out to a 
   bunch of nodes, and once enough of them acknowledge they've
   got it, "Pulsar is happy". Majority always wins, and all votes 
   are equal giving more consistent latency behavior over time.
 @[https://kafkaesque.io/performance-comparison-between-apache-pulsar-and-kafka-latency/]
   (Kafka quorum is also a WiP in KIP-250)

10.Tiered storage.
   What if you like to store messages forever (event-sourcing)?
   It can get expensive on main high-performance SSDs.
 BºWith Pulsar tiered storage you can automatically push old messages   º
 Bºinto practically infinite, cheap cloud storage (S3) and retrieve themº
 Bºjust like you do those newer, fresh-as-a-daisy messages.º
   (Kafka describes this feature in KIP-405).

11.End-to-end encryption
   Producer Java client can encrypt message using shared keys with
   the consumer. (hidden info to broker).
   Kafka is in feature-request state (KIP-317).

12.When an (stateless) Pulsar broker gets overloaded, pulsar rebalance
   request from clients automatically.
   It monitors the broker ussage of CPU, memory, and network (vs disk, since 
   brokers is stateless) to take the decision to balance.
   No need to add a new broker until all brokers are at full.
   In Kafka load-balancing is done by  installing another package 
   such as LinkedIn's Cruise Control or paying for Confluent's rebalancer
   tool.

— See also: [TODO]
@[https://streamnative.io/blog/tech/pulsar-vs-kafka-part-1]
Leasson Learned
@[https://www.confluent.io/blog/stories-front-lessons-learned-supporting-apache-kafka/]
- Under-replicated Partitions Continue to Grow Inexplicably
- Kafka Liveness Check and Automation Causes Full Cluster Down
- Adding a New Broker Results in Terrible Performance
Kafka Backup
@[https://github.com/itadventurer/kafka-backup]
- Kafka Backup is a tool to back up and restore your Kafka data 
  including all (configurable) topic data and especially also consumer 
  group offsets. To the best of our knowledge, Kafka Backup is the only 
  viable solution to take a cold backup of your Kafka data and restore 
  it correctly.
Faust (Python Streams)
https://github.com/robinhood/faust

- Faust provides both stream processing and event processing, sharing 
  similarity with tools such as Kafka Streams, Apache 
  Spark/Storm/Samza/Flink,

ELK

ELK
ElasticSearch (Search Engine)
ELK Summary
-ºElasticSearch cluster solutions in practice means:º
  - Deploy dozens of machines/containers.
  - Server tuning
  - mapping writing: i.e., - deciding how the data is:
    - tokenized
    - analyzed
    - indexed
    depending of user's requirements, internal limitations, and 
    available hardware resources.

- Elasticsearch is the central piece of ELK (Elastic Search, Logstash, and Kibana)
- developed and maintained by Elastic.
- Elasticsearch:
  - Add clustering and enterprise features on top of the Apache Lucene Library.
  - Originally designed for text-search indexation. 
    New versions can be used as a general data indexing solution.
    Extracted from: @[https://docs.sonarqube.org/latest/requirements/requirements/]
    """ ...the "data" folder houses the Elasticsearch indices on which a huge amount
        of I/O will be done when the server is up and running. Great read ⅋ write
        hard drive performance will therefore have a great impact on the overall
        server performance.
    """
Cerebro: admin GUI
@[https://www.redhat.com/sysadmin/cerebro-webui-elk-cluster]
Elasticsearch comes as a set of blocks, and you—as a designer - are supposed
to glue them together. Yet, the way the software comes out of the box does not
cover everything. So, to me, it was not easy to see the cluster's heartbeat
all in one place. I needed something to give me an overview as well as allow me
to take action on basic things.
I wanted to introduce you to a helpful piece of software I found: Cerebro.

According to the Cerebro GitHub page:
Cerebro is an open source (MIT License) elasticsearch web admin tool built
using Scala, Play Framework, AngularJS, and Bootstrap.
Solr (ElasticSearch Alternative)
@[http://lucene.apache.org/solr/]
- blazing-fast, search platform built on Apache Lucene

- Doesn't include analytics engine like ElasticSearch
Logstash (Collector)

- Logstash: log pipeline (ingest/transform/load it into a store
            like Elasticsearch)
  (It is common to replace Logstash with Fluentd)
- Kibana: visualization layer on top of Elasticsearch.
- In productiona few other pieces might be included like:
  - Kafka, Redis, NGINX, ....


logstash-plugins
https://github.com/logstash-plugins
Plugins that can extend Logstash's functionality.
   logstash-codec-protobuf  parsing Protobuf messages
   logstash-integration-kafka
   logstash-filter-elasticsearch
   amazon-s3-storage
   logstash-codec-csv
   ...
Logstash Alternatives
Graylog
- log management tool based on Java, ElasticSearch and MongoDB.
   Graylog can be used to collect, index and analyze 
   any server log from a centralized location or distributed location. We can 
   easily monitor any unusual activity for debugging applications and logs using 
   Graylog. Graylog provides a powerful query language, alerting abilities, a 
   processing pipeline for data transformation and much more. We also can extend 
   the functionality of Graylog through a REST API and Add-ons.
  - gaining popularity in the Go community with
    the introduction of the Graylog Collector Sidecar
    written in Go.
  - ... it still lags far behind the ELK stack.
  - Composed  under the hood of:
    - Elasticsearch
    - MongoDB
    - Graylog Server.
  - comes with alerting built into the open source version
    and streaming, message rewriting, geolocation, ....
  - Ex:
  @[https://www.howtoforge.com/how-to-monitor-log-files-with-graylog-v31-on-debian-10/]
Fluentd
-WºIt is not a global log aggregation system but a local one.º
-  Adopted by the CNCF as Incubating project.
-  Recommended by AWS and Google Cloud.
-  Common replacement for Logstash
    - It acts as a local aggregator to collect all node logs and send them off to central
      storage systems.
-Oº500+ plugins for  quick and easy integrations withº
 Oºdifferent data input/outputs.                     º
-Bºcommon choice in Kubernetes environments due to:º
 Bº- low memory requirements (tens of megabytes)   º
 Bº  (each pod has a Fluentd sidecar)              º
 Bº- high throughput.                              º
Kibana (UI)
Kibana Summary
- Analytical UI results extracted from ElasticSearch.

Kibana Alternatives - Grafana: lower level UI for infrastructure data visualization. Directly can inject data from many sources. Kibana must use logstash to inject from non-elasticsearch sources. - Tableau: Very simple to use and ability to produce interactive visualizations. - well suited to handling the huge and very fast-changing datasets - integrates with Hadoop, Amazon AWS, My SQL, SAP, Teradata, .... - Qlikview: Tableau’s biggest competitor. - FusionCharts: non free JavaScript-based charting and visualization package 90+ different chart types, rather than having to start each new visualization from scratch, users can pick from "live" example templates. - Highcharts: non free. (It can be used freely as a trial, non-commercial or personal use). - Datawrapper: increasingly popular choice to present charts and statistics. - simple, clear interface - very easy to upload csv data and create straightforward charts, also maps, ... - Plotly : Plotly enables more complex and sophisticated visualizations, -BºIntegrats with analytics-oriented programming languages such as Python, R and Matlab.º - built on top of the open source d3.js. - Free/Non free licence. - support for APIs such as Salesforce. - Sisense : full stack analytics platform but its visualization capabilities. simple-to-use drag and drop interface. Dashboards can then be shared across organizations.

Non-Ordered

Non-Ordered
InfluxDB (Elas.Search Alt from TimeSeries
@[https://logz.io/blog/influxdb-vs-elasticsearch/]
awesome influxdb
https://github.com/PoeBlu/awesome-influxdb
A curated list of awesome projects, libraries, tools, etc. related to InfluxDB
Tools whose primary or sole purpose is to feed data into InfluxDB.

- accelerometer2influx - Android application that takes the x-y-z axis metrics 
   from your phone accelerometer and sends the data to InfluxDB.
- agento - Client/server collecting near realtime metrics from Linux hosts
- aggregateD - A dogstatsD inspired metrics and event aggregation daemon for 
  InfluxDB
- aprs2influxdb - Interfaces ham radio APRS-IS servers and saves packet data 
  into an influxdb database
- Charmander - Charmander is a lab environment for measuring and analyzing 
  resource-scheduling algorithms
- gopherwx - a service that pulls live weather data from a Davis Instruments 
  Vantage Pro2 station and stores it in InfluxDB
- grade - Track Go benchmark performance over time by storing results in 
  InfluxDB
- Influx-Capacitor - Influx-Capacitor collects metrics from windows machines 
  using Performance Counters. Data is sent to influxDB to be viewable by grafana
- Influxdb-Powershell - Powershell script to send Windows Performance counters 
  to an InfluxDB Server
- influxdb-logger - SmartApp to log SmartThings device attributes to an 
  InfluxDB database
- influxdb-sqlserver - Collect Microsoft SQL Server metrics for reporting to 
  InfluxDB and visualize them with Grafana
- k6 - A modern load testing tool, using Go and JavaScript
- marathon-event-metrics - a tool for reporting Marathon events to InfluxDB
- mesos-influxdb-collector - Lightweight mesos stats collector for InfluxDB
- mqforward - MQTT to influxdb forwarder
- node-opcua-logger - Collect industrial data from OPC UA Servers
- ntp_checker - compares internal NTP sources and warns if the offset between 
  servers exceeds a definable (fraction of) seconds
- proc_to_influxdb - Console app to observe Windows process starts and stops 
  via InfluxDB
- pysysinfo_influxdb - Periodically send system information into influxdb (uses 
  python3 + psutil, so it also works under Windows)
- sysinfo_influxdb - Collect and send system (linux) info to InfluxDB
- snmpcollector - A full featured Generic SNMP data collector with Web 
  Administration Interface for InfluxDB
- Telegraf - (Official) plugin-driven server agent for reporting metrics into 
  InfluxDB
- tesla-streamer - Streams data from Tesla Model S to InfluxDB (rake task)
- traffic_stats - Acquires and stores statistics about CDNs controlled by 
  Apache Traffic Control
- vsphere-influxdb-go - Collect VMware vSphere, vCenter and ESXi performance 
  metrics and send them to InfluxDB
MFA OOSS
https://opensource.com/article/20/3/open-source-multi-factor-authentication Open source alternative for multi-factor authentication: privacyIDEA 
Chrony (NTP replacement)
https://www.infoq.com/news/2020/03/ntp-chrony-facebook/ 
Facebook’s Switch from ntpd to chrony for a More Accurate, Scalable NTP Service
Apache Beam
Apache Beam provides an advanced unified programming model, allowing 
you to implement batch and streaming data processing jobs that can 
run on any execution engine.
Allows to execute pipelines on multiple environments such as Apache 
Apex, Apache Flink, Apache Spark among others.
Apache Ignite
Apache Ignite is a high-performance, integrated and distributed 
in-memory platform for computing and transacting on large-scale data 
sets in real-time, orders of magnitude faster than possible with 
traditional disk-based or flash technologies.
Apache PrestoDB
@[https://prestodb.io/]
- distributed SQL query engine originally developed by Facebook.
- running interactive analytic queries against data sources of
  all sizes ranging from gigabytes to petabytes.
- Engine can combine data from multiple sources (RDBMS, No-SQL, 
  Hadoop) within a single query, and it has little to no performance 
  degradation running. Being used and developed by big data giants 
  like, among others,  Facebook, Twitter and Netflix, guarantees a 
  bright future for this tool.
- Designed and written from the ground up for interactive 
  analytics and approaches the speed of commercial data warehouses 
  while scaling to the size of organizations like Facebook. 
- NOTE: Presto uses Apache Avro to represent data with schemas.
  (Avro is "similar" to Google Protobuf/gRPC)
Google Colossus FS
@[https://www.systutorials.com/3202/colossus-successor-to-google-file-system-gfs/]
ddbb from scratch
Writing a SQL database from scratch in Go: 3. indexes | notes.eatonphil.com !!!
https://notes.eatonphil.com/database-basics-indexes.html 
Mattermost+Discourse
CERN, cambia el uso de Facebook Workplace por Mattermost y Discourse
https://www.linuxadictos.com/cern-cambia-el-uso-de-facebook-workplace-por-mattermost-y-discourse.html 
Intel Optane DC performance
http://mikaelronstrom.blogspot.com/2020/02/benchmarking-5-tb-data-node-in-ndb.html?m=1
Through the courtesy of Intel I have access to a machine with 6 TB of Intel
Optane DC Persistent Memory. This is memory that can be used both as
persistent memory in App Direct Mode or simply used as a very large
DRAM in Memory Mode.

Slides for a presentation of this is available at slideshare.net.

This memory can be bigger than DRAM, but has some different characteristics
compared to DRAM. Due to this different characteristics all accesses to this
memory goes through a cache and here the cache is the entire DRAM in the
machine.

In the test machine there was a 768 GB DRAM acting as a cache for the
6 TB of persistent memory. When a miss happens in the DRAM cache
one has to go towards the persistent memory instead. The persistent memory
has higher latency and lower throughput. Thus it is important as a programmer
to ensure that your product can work with this new memory.

What one can expect performance-wise is that performance will be similar to
using DRAM as long as the working set is smaller than DRAM. As the working
set grows one expects the performance to drop a bit, but not in a very significant
manner.

We tested NDB Cluster using the DBT2 benchmark which is based on the
standard TPC-C benchmark but uses zero latency between transactions in
the benchmark client.

This benchmark has two phases, the first phase loads the data from 32 threads
where each threads loads one warehouse at a time. Each warehouse contains
almost 500.000 rows in a number of tables.
SOAP vs RESTfull vs ...
SOAP vs RESTfull vs JSON-RPC vs gRPC vs Drift
EventQL
eventql/eventql: Distributed "massively parallel" SQL query engine
https://github.com/eventql/eventql 
n-gram index
 "...Developed an indexing and search software, mostly in C. 
  The indexer is multi-threaded and computes a n-gram index 
  - stored in B-Trees - on hundreds of GB of data generated 
  every day in production. The associated search engine is also
  a multi-threaded, efficient C program."

- In the fields of computational linguistics and probability, an n-gram 
  is a contiguous sequence of n items from a given sample of text or 
  speech. The items can be phonemes, syllables, letters, words or base 
  pairs according to the application. The n-grams typically are 
  collected from a text or speech corpus. When the items are words, 
  n-grams may also be called shingles[clarification needed].
Apollo GraphQL Middelware
https://www.infoq.com/news/2020/06/apollo-platform-graphql/
Apollo Data Graph Platform: A GraphQL Middleware Layer for the Enterprise
Genesis: end-to-end testing
Genesis: An End-To-End Testing & Development Platform For Distributed Systems
https://github.com/EntEthAlliance/genesis/branches
Outputting data to collect for later analysis is as simple as writing 
a JSON object to stdout. You can review the data we collect from each 
test on the test details page in the dashboard, build your own 
metrics and analytics dashboards with Kibana, and write your own 
custom data analysis and reports using Jupyter Notebooks.
fallacies of distributed computing
@[https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing]
The fallacies are[1]

    The network is reliable;
    Latency is zero;
    Bandwidth is infinite;
    The network is secure;
    Topology doesn't change;
    There is one administrator;
    Transport cost is zero;
    The network is homogeneous.

The effects of the fallacies
- Software applications are written with little error-handling on 
  networking errors. During a network outage, such applications may 
  stall or infinitely wait for an answer packet, permanently consuming 
  memory or other resources. When the failed network becomes available, 
  those applications may also fail to retry any stalled operations or 
  require a (manual) restart.
- Ignorance of network latency, and of the packet loss it can cause, 
  induces application- and transport-layer developers to allow 
  unbounded traffic, greatly increasing dropped packets and wasting 
  bandwidth.
- Ignorance of bandwidth limits on the part of traffic senders can 
  result in bottlenecks.
- Complacency regarding network security results in being blindsided 
  by malicious users and programs that continually adapt to security 
  measures.[2]
- Changes in network topology can have effects on both bandwidth and 
  latency issues, and therefore can have similar problems.
- Multiple administrators, as with subnets for rival companies, may 
  institute conflicting policies of which senders of network traffic 
  must be aware in order to complete their desired paths.
- The "hidden" costs of building and maintaining a network or subnet 
  are non-negligible and must consequently be noted in budgets to avoid 
  vast shortfalls.
- If a system assumes a homogeneous network, then it can lead to the 
  same problems that result from the first three fallacies.
Jepsen Project
https://jepsen.io/
Jepsen is an effort to improve the safety of distributed databases, queues, 
consensus systems, etc. We maintain an open source software library for systems 
testing, as well as blog posts and conference talks exploring particular 
systems’ failure modes. In each analysis we explore whether the system lives 
up to its documentation’s claims, file new bugs, and suggest recommendations 
for operators.

Jepsen pushes vendors to make accurate claims and test their software 
rigorously, helps users choose databases and queues that fit their needs, and 
teaches engineers how to evaluate distributed systems correctness for 
themselves.
Time Series + Spark
https://github.com/AmadeusITGroup/Time-Series-Library-with-Spark
NATS

Extracted from:
https://docs.baseline-protocol.org/baseline-protocol/packages/messaging
NATS is currently the default point-to-point messaging provider and 
the recommended way for organizations to exchange secure protocol 
messages. NATS was chosen due to its high-performance capabilities, 
community/enterprise footprint, interoperability with other systems 
and protocols (i.e. Kafka and MQTT) and its decentralized 
architecture.

https://docs.nats.io/
The Importance of Messaging
Developing and deploying applications and services that communicate 
in distributed systems can be complex and difficult. However there 
are two basic patterns, request/reply or RPC for services, and event 
and data streams. A modern technology should provide features to make 
this easier, scalable, secure, location independent and observable.
Distributed Computing Needs of Today
A modern messaging system needs to support multiple communication 
patterns, be secure by default, support multiple qualities of 
service, and provide secure multi-tenancy for a truly shared 
infrastructure. A modern system needs to include:
- Secure by default communications for microservices, edge 
  platforms and devices
- Secure multi-tenancy in a single distributed communication 
  technology
- Transparent location addressing and discovery
- Resiliency with an emphasis on the overall health of the system
- Ease of use for agile development, CI/CD, and operations, at scale
- Highly scalable and performant with built-in load balancing and 
  dynamic auto-scaling
- Consistent identity and security mechanisms from edge devices to 
  backend services

NATS is simple and secure messaging made for developers and 
operators who want to spend more time developing modern applications 
and services than worrying about a distributed communication system.
- Easy to use for developers and operators
- High-Performance
- Always on and available
- Extremely lightweight
- At Most Once and At Least Once Delivery
- Support for Observable and Scalable Services and Event/Data 
  Streams
- Client support for over 30 different programming languages
- Cloud Native, a CNCF project with Kubernetes and Prometheus 
  integrations

Use Cases
NATS can run anywhere, from large servers and cloud instances, 
through edge gateways and even IoT devices. Use cases for NATS 
include:
- Cloud Messaging
  - Services (microservices, service mesh)
  - Event/Data Streaming (observability, analytics, ML/AI)
- Command and Control
  - IoT and Edge
  - Telemetry / Sensor Data / Command and Control
- Augmenting or Replacing Legacy Messaging Systems
"Lambda" Architecture
LinkedIn drops Lambda Arch to remove complexity
https://www.infoq.com/news/2020/12/linkedin-lambda-architecture/

OpenFaas (k8s)
@[https://www.openfaas.com/]
Serverless Functions, Made Simple.
OpenFaaS® makes it simple to deploy both functions and existing code to Kubernetes


CliG Dev
https://clig.dev/
Command Line Interface Guidelines
Debezium: Reacting to RDBM row changes
@[https://developers.redhat.com/blog/2020/12/11/debezium-serialization-with-apache-avro-and-apicurio-registry/]
- Debezium is a set of distributed services that captures row-level 
  database changes so that applications can view and respond to them. 
  Debezium connectors record all events to a Red Hat AMQ Streams Kafka 
  cluster. Applications use AMQ Streams to consume change events.
Apache AirFlow
https://dzone.com/articles/apache-airflow-20-a-practical-jump-start?edition=676391
- community driven platfom to programmatically author, schedule and monitor workflows. 

- It is NOT a data streaming solution. Tasks do not move data from one to the other
- Workflows are expected to be mostly static or slowly changing with workflows expected
  to look similar from a run to the next.

- "Batteries included" for lot of external apps:
    Amazon                Elasticsearch                                Opsgenie     Vertica
    Apache Beam           Exasol                                       Oracle       Yandex
    Apache Cassandra      Facebook                                     Pagerduty    Zendesk
    Apache Druid          File Transfer Protocol (FTP)                 Papermill
    Apache HDFS           Google                                       Plexus
    Apache Hive           gRPC                                         PostgreSQL
    Apache Kylin          Hashicorp                                    Presto
    Apache Livy           Hypertext Transfer Protocol (HTTP)           Qubole
    Apache Pig            Internet Message Access Protocol (IMAP)      Redis
    Apache Pinot          Java Database Connectivity (JDBC)            Salesforce
    Apache Spark          Jenkins                                      Samba
    Apache Sqoop          Jira                                         Segment
    Celery                Microsoft Azure                              Sendgrid
    IBM Cloudant          Microsoft SQL Server (MSSQL)                 SFTP
    Kubernetes            Windows Remote Management (WinRM)            Singularity
    Databricks            MongoDB                                      Slack
    Datadog               MySQL                                        Snowflake
    Dingding              Neo4J                                        SQLite
    Discord               ODBC                                         SSH
    Docker                OpenFaaS                                     Tableau
                                                                       Telegram