Metadata-Version: 2.1
Name: insights-analytics-collector
Version: 0.3.2
Summary: Collector Package for Insights for AAP
Home-page: https://github.com/RedHatInsights/insights-analytics-collector/
Author: Martin Slemr
Author-email: mslemr@redhat.com
License: Apache
Description: # Insights Analytics Collector
        
        This package helps collect data through user-defined collector methods.
        It packs the collected data into one or more tarballs and sends them to a user-defined URL.
        
        Some functions and classes have to be implemented.
        By function:
        - persisting settings
        - providing data for shipping (POST request), such as credentials and content type
        
        By class:
        - Collector
        - Package
        - collector_module:
          - functions with the `@register` decorator, one of them with `config=True, format='json'`
          - slicing functions (optional) for splitting large data (db tables) by time intervals
        
        ## Collector
        
        Entry point with the `gather()` method.
        
        ### Implementation
        
        `Collector` is an abstract class; implement its abstract methods:
        
        - `_package_class`: Returns the class of your `Package` implementation
        - `_is_valid_license`: Checks for a valid license specific to your service
        - `_is_shipping_configured`: Checks whether shipping to the cloud is configured
        - `_last_gathering`: Returns a datetime. Loads the last successful run from persistent storage
        - `_save_last_gather`: Persists the last successful run
        - `_load_last_gathered_entries`: Has to fill the dictionary `self.last_gathered_entries` from persistent storage.
          The dict's keys equal the keys of the collector's registered functions (those with the `@register` decorator)
        - `_save_last_gathered_entries`: Persists `self.last_gathered_entries`
        
        An example can be found in [Test collector](tests/classes/analytics_collector.py)
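        
        The four load/save methods above need some persistent storage behind them. The following is a minimal sketch of such storage, assuming a single JSON file is acceptable and that the values in `last_gathered_entries` are datetimes. The class `JsonFileState` and its method names are hypothetical and not part of this package; your `Collector` subclass would call them from its own `_last_gathering`, `_save_last_gather`, `_load_last_gathered_entries` and `_save_last_gathered_entries`.
        
        ```python
        import json
        from datetime import datetime, timezone
        from pathlib import Path
        
        
        class JsonFileState:
            """Hypothetical helper that keeps gathering state in one JSON file."""
        
            def __init__(self, path):
                self.path = Path(path)
        
            def _read(self):
                # A missing file means nothing has been gathered yet.
                if not self.path.exists():
                    return {}
                return json.loads(self.path.read_text())
        
            def _write(self, data):
                self.path.write_text(json.dumps(data))
        
            def load_last_gather(self):
                """Return the last successful run as a datetime, or None."""
                raw = self._read().get("last_gather")
                return datetime.fromisoformat(raw) if raw else None
        
            def save_last_gather(self, when):
                data = self._read()
                data["last_gather"] = when.isoformat()
                self._write(data)
        
            def load_last_entries(self):
                """Return {collector key: datetime}, assuming datetime values."""
                raw = self._read().get("last_entries", {})
                return {key: datetime.fromisoformat(value) for key, value in raw.items()}
        
            def save_last_entries(self, entries):
                data = self._read()
                data["last_entries"] = {key: value.isoformat() for key, value in entries.items()}
                self._write(data)
        ```
        
        In a real service the same role is more likely played by a settings table or a key-value store; the file is used here only to keep the sketch self-contained.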
        
        ## Package
        
        One package represents one `.tar.gz` file that will be uploaded to Analytics.
        Data from registered collectors is placed into packages as JSON/CSV files as follows:
        
        - The upload limit is 100MB. The maximum size of uncompressed data is `MAX_DATA_SIZE` (estimated at 200MB; redefine if needed)
        - JSON collectors are processed first; they are not expected to exceed this size
          - if they do, use the CSV format instead
        - CSV files can be collected in two modes:
          - with a slicing function
            - splits data by a custom function, usually by time interval
            - the purpose is to keep SQL queries reasonable in big databases
            - `@register(fnc_slicing=...)`
          - without a slicing function
        - CSV files are expected to be large (db data), so they can be split by `CsvFileSplitter` in the collector function.
        
        How files are included in packages:
        - JSON files are in the first package
        - CSVs without slicing are added to the first free package with enough space (they can be added alongside the JSON files)
          - if a function collects e.g. 900MB, it is sent in the first 5 packages
          - two functions cannot have the same name in the `@register()` decorator
        - CSVs with slicing are sent after each slice is collected (with respect to the smaller volume size when running in OpenShift/docker)
          - each slice can also be split by `CsvFileSplitter` if it is bigger than `MAX_DATA_SIZE`
            - then each part of the slice is sent immediately
          - two functions can have the same name in the `@register()` decorator
        
        The number of packages (tarballs) is the larger of:
        - the number of files collected by the biggest registered CSV collector without slicing
        - the number of files collected by all registered CSV collectors with slicing
        
        One more package may be added for JSON files.
        
        See the [test_gathering.py](tests/functional/test_gathering.py) for details
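        
        The packaging arithmetic above can be illustrated with a short sketch. It is a literal reading of the rules in this section, not the package's own code: both function names are hypothetical, and 1MB is assumed to be 1,000,000 bytes.
        
        ```python
        MB = 1000 * 1000  # assumption: decimal megabytes
        
        
        def packages_for(total_bytes, max_data_size):
            """Packages needed for the output of one CSV collector without slicing."""
            # Integer ceiling division avoids floating-point rounding.
            return -(-total_bytes // max_data_size)
        
        
        def estimate_package_count(unsliced_file_counts, sliced_file_counts, extra_json_package=False):
            """Estimate the number of tarballs from per-collector file counts."""
            largest_unsliced = max(unsliced_file_counts, default=0)
            all_sliced = sum(sliced_file_counts)
            count = max(largest_unsliced, all_sliced)
            return count + 1 if extra_json_package else count
        
        
        # 900MB of data with a 200MB MAX_DATA_SIZE needs 5 packages.
        print(packages_for(900 * MB, 200 * MB))
        ```
        
        Treat the result as an estimate; the tests referenced above remain the authoritative description of the behavior.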
        
        ### Implementation
        
        `Package` is also an abstract class. You mainly have to provide the information for the POST request to the cloud.
        
        - `PAYLOAD_CONTENT_TYPE`: The registered content type for the cloud's ingress service
        - `MAX_DATA_SIZE`: Maximum size in bytes of **uncompressed** data for one tarball. Ingress limits uploads to 100MB. Defaults to 200MB.
        - `get_ingress_url`: The cloud's ingress service URL
        - `_get_rh_user`: User for the POST request
        - `_get_rh_password`: Password for the POST request
        - `_get_x_rh_identity`: X-RH identity, used for local testing instead of user and password
        - `_get_http_request_headers`: Dict with any custom headers for the POST request
         
        An example can be found in [Test package](tests/classes/package.py)
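        
        To show how these values could fit together, here is a small sketch that assembles keyword arguments for an HTTP client call. It is an illustration only: `build_post_kwargs` is a hypothetical helper, the `x-rh-identity` header name and the `auth` tuple layout are assumptions, and the package may combine the values differently.
        
        ```python
        def build_post_kwargs(user=None, password=None, x_rh_identity=None, extra_headers=None):
            """Combine credentials and custom headers for a POST request (hypothetical)."""
            headers = dict(extra_headers or {})  # copy so the caller's dict is untouched
            kwargs = {"headers": headers}
            if x_rh_identity:
                # Local testing: the identity header replaces user and password.
                headers["x-rh-identity"] = x_rh_identity
            else:
                kwargs["auth"] = (user, password)
            return kwargs
        
        
        kwargs = build_post_kwargs(user="analytics", password="secret",
                                   extra_headers={"User-Agent": "my-collector"})
        ```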
        
        ## Collector module
        
        The module with gathering functions is the main part you need to implement.
        It should contain functions returning data either as a `dict` or as a list of CSV files.
        
        A function is registered with the `@register` decorator:
        ```python
        from insights_analytics_collector import register
        
        @register('json_data', '1.0', format='json', description="Data description")
        def json_data(**kwargs):
            return {'my_data': 'True'}
        ```
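        
        A CSV collector returns a list of files instead of a `dict`. The sketch below shows only the file-writing part using the standard library, so it runs without this package. It assumes the collector receives its output directory through a keyword argument, here called `full_path`; that name, the function `hosts_csv` and the sample rows are illustrative assumptions. In a real collector module the function would additionally be decorated with `@register(..., format='csv')`.
        
        ```python
        import csv
        import os
        import tempfile
        
        
        def hosts_csv(full_path, **kwargs):
            """Write rows to a CSV file and return the list of created files."""
            file_path = os.path.join(full_path, "hosts.csv")
            rows = [("id", "name"), (1, "alpha"), (2, "beta")]  # placeholder data
            with open(file_path, "w", newline="") as csv_file:
                csv.writer(csv_file).writerows(rows)
            return [file_path]
        
        
        # Usage: collect into a temporary directory and read the result back.
        with tempfile.TemporaryDirectory() as output_dir:
            files = hosts_csv(output_dir)
            with open(files[0]) as csv_file:
                lines = csv_file.read().splitlines()
        ```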
        
        The `@register` decorator has the following attributes:
        - **key**: (string) name of the output file (usually the same as the function name)
        - **version**: (string) e.g. '1.0'. Version of the data, added to `manifest.json` for parsing on the cloud's side
        - **description**: (string) not used yet
        - **format**: (string) Default: 'json'. Extension of the output file, can be "json" or "csv". Also determines the function's output.
        - **config**: (bool) Default: False. There **has to be one** function with `config=True, format='json'`
        - **fnc_slicing**: Intended for large data. Described in [Slicing function](#slicing-function) below
        - **shipping_group**: (string) Default: 'default'. Splits data into packages by group, if required.
        
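        Gathering is started through your `Collector` implementation:
        
        ```python
        from <your-namespace> import Collector  # your implementation
        
        collector = Collector()  # create an instance before calling gather()
        collector.gather()
        ```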
        
        ### Slicing function
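        
        As described in the Package section, a slicing function splits a large collection into smaller parts, usually by time interval, so that each SQL query stays reasonable. The generator below is a minimal sketch of that idea using only the standard library. The name `time_slices`, its parameters and the four-hour default step are assumptions for illustration; the exact signature expected by `fnc_slicing` is not shown in this document, so adapt the sketch to it.
        
        ```python
        from datetime import datetime, timedelta
        
        
        def time_slices(since, until, step=timedelta(hours=4)):
            """Yield consecutive (start, end) pairs covering the interval [since, until)."""
            start = since
            while start < until:
                # The last slice is shortened so it never runs past `until`.
                end = min(start + step, until)
                yield start, end
                start = end
        
        
        # Usage: a ten-hour window is split into slices of 4, 4 and 2 hours.
        slices = list(time_slices(datetime(2023, 1, 1, 0), datetime(2023, 1, 1, 10)))
        ```
        
        Each `(start, end)` pair can then bound one query, for example with a `WHERE` clause on a timestamp column.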
        
        ## Collectors
        
        
        ## Registered collectors
        
        
        ## Abstract classes
        
        
        ## Tarballs
        
Platform: UNKNOWN
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Description-Content-Type: text/markdown
License-File: LICENSE.md
