UPGRADE GUIDE
This document describes changes in behaviour to provide guidance to those upgrading from a previous version. Sections are titled to indicate changes needed when upgrading to that version. To upgrade across several versions, one needs to start at the version after the one installed, and heed all notifications for interim versions. While configuration language stability is an important goal, on occasion changes cannot be avoided. This file does not document new features, but only changes that cause concern during upgrades. The notices take the form:
- CHANGE
Indicates where configuration files must be changed to get the same behaviour as prior to the release.
- ACTION
Indicates a maintenance activity required as part of an upgrade process.
- BUG
Indicates a bug serious enough that deployment of this version is not recommended.
- NOTICE
A behaviour change that will be noticeable during upgrade, but is no cause for concern.
- SHOULD
Indicates interventions that are recommended, but not mandatory. If the prescribed activity is not done, the consequence is either a configuration line that has no effect (wasteful) or notification messages generated by the application.
The sections are titled by the version at which the changes take place.
Installation Instructions
git
3.0.56
CHANGE: code refactor: sarracenia.credentials… classes are now in sarracenia.config.credentials. Any code using credentials needs to be updated.
CHANGE: queue settings are now stored in a subscriptions.json state file, along with more information, instead of in a .qname file. The transition may be complex. This version reads and writes both files, so as to preserve the ability to downgrade. A later version will drop support for .qname files.
CHANGE: in configuration files, subtopic must come after the relevant queue naming options (queueName, queueShare). In prior releases, the queue naming was a global setting. In a future version, one will be able to subscribe to multiple queues with a single subscriber.
3.0.54
CHANGE: sr3 sanity only restarts missing instances, not stopped ones. This is considered more in accordance with analyst expectations (POLA).
- CHANGE: the new queueShare setting should be used wherever queueName was used in prior versions. This should result in fewer configuration settings, and the queueShare settings used should also be shorter than the former queueName ones.
CHANGE: default queue names changed from randomized value to one based on user name and host name.
Existing configurations without explicit queueName settings will continue to use old queues if they are already in use. As cleanup commands are executed, or in newly deployed configurations, the new naming will gradually take effect. The new queueShare setting can be used to customize that.
Please review all configurations that:
* do not have explicit queueName settings, AND
* run on multiple hosts
to understand whether they need queueShare in order to keep the same sharing as previously obtained by default.
3.0.53
CHANGE: directory option in poll will no longer be converted to path silently. Use path explicitly instead. It is still converted when upgrading from v2 with sr3 convert, but in v3 configurations, directory now acts as it does in all other components as a download specifier.
3.0.52
CHANGE: Additional messageCountMax argument to the flowcb.gather() entry point. When implementing flow callbacks for scheduled flows, or poll overrides, the gather entry point now takes one additional argument indicating the maximum number of messages that the routine should return.
To be compatible with previous versions, one can establish a default value on the gather:
def gather(self, messageCountMax=None):
With the default value, plugins are backward compatible (earlier versions will call with only self as an argument).
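As an illustration of the default-argument approach, here is a minimal sketch. The class name ExampleGather and its pending list are hypothetical, and the class deliberately does not subclass sarracenia.flowcb.FlowCB so that it runs on its own. The return value is simplified to a plain list; check the flowcb documentation for what the installed version expects gather to return.

```python
class ExampleGather:
    """Hypothetical callback; a real plugin would subclass sarracenia.flowcb.FlowCB."""

    def __init__(self, pending):
        self.pending = list(pending)

    def gather(self, messageCountMax=None):
        # Older releases call gather() with no argument, so the limit
        # defaults to None, treated here as "no limit requested".
        if messageCountMax is None:
            batch, self.pending = self.pending, []
        else:
            batch = self.pending[:messageCountMax]
            self.pending = self.pending[messageCountMax:]
        return batch


cb = ExampleGather(["m1", "m2", "m3"])
print(cb.gather(2))  # new-style call, limited to two messages
print(cb.gather())   # old-style call, returns whatever is left
```

The same method body serves both calling conventions, which is the point of giving the new argument a default.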
3.0.51
CHANGE: Additional action argument to sarracenia.config.one_config(), indicating how the configuration will be used. When used for read-only operations (status, show, dump), the configuration should avoid filling in values that should only be defined when used. Examples have been updated appropriately.
CHANGE: the action setting is now mandatory for sarracenia.config.finalize().
3.0.47
CHANGE: config option, strftime options: the offset grammar changed. In v2 one had ${YYYYMMDD-70m}; in sr3 it should be ${%o-70m%Y%m%d}. In 3.0.47, the time offset parsing moved to the beginning of the pattern.
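To make the meaning of such a pattern concrete, the sketch below computes the string that an offset of 70 minutes followed by %Y%m%d would produce. It assumes the offset is subtracted from the current time before the strftime format is applied. The function name shifted_date is hypothetical, and this is plain Python, not the sr3 parser.

```python
from datetime import datetime, timedelta, timezone


def shifted_date(now, minutes_back=70, fmt="%Y%m%d"):
    # Assumption: "-70m" means "70 minutes before now", applied before formatting.
    return (now - timedelta(minutes=minutes_back)).strftime(fmt)


# 70 minutes before 00:30 on 2024-01-01 falls on the previous day.
now = datetime(2024, 1, 1, 0, 30, tzinfo=timezone.utc)
print(shifted_date(now))  # 20231231
```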
CHANGE: default value of filename setting is now None instead of ‘WHATFN’, which reduces compatibility with Sundew, but makes behaviour less surprising when not using/familiar with Sundew. This None setting is the same as used by v2, so it should improve compatibility with sarracenia v2 configurations.
3.0.45
- CHANGE: config option: logRotateInterval units used to be days; it is now a time interval (seconds) like all other intervals.
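When reviewing old settings, a value formerly expressed in days can be converted to seconds with simple arithmetic. The helper name days_to_seconds below is hypothetical and only illustrates the conversion.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_seconds(days):
    # Convert an old logRotateInterval value (days) to the new unit (seconds).
    return int(days * SECONDS_PER_DAY)


print(days_to_seconds(1))  # 86400
```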
3.0.41
CHANGE: v03 post format field renamed: “integrity” is now “identity”.
* The current version will read messages with integrity and map them to identity.
* The current version will post with identity, so older versions will miss them.
metpx-sr3c >= v3.23.06 (equivalent compatible C implementation)
metpx-sarracenia >= v2.23.06 (equivalent v2 compatible (legacy) version.)
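The read-side mapping described above can be pictured with a small sketch operating on a message represented as a plain dictionary. The function name normalize_identity and the sample field contents are hypothetical; this is not the code sr3 itself uses.

```python
def normalize_identity(msg):
    # Work on a copy so the caller's message is left untouched.
    out = dict(msg)
    # Map the old field name to the new one, unless the new one is already set.
    if "integrity" in out and "identity" not in out:
        out["identity"] = out.pop("integrity")
    return out


old_style = {"relPath": "a/b.txt", "integrity": {"method": "sha512", "value": "abc"}}
print(normalize_identity(old_style))
```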
3.0.40
- CHANGE: the default format in which messages are posted is v03, but as of this version, to override the format, one must use post_format v02. Prior to this version, setting post_topicPrefix was sufficient. Now both settings are needed.
CHANGE: Python API breaking changes
For sarracenia.moth, now specify the broker as options[‘broker’] instead of as a separate parameter:
before:
    Moth(broker: url, options: dict, is_subscriber: bool) -> Config
    pubFactory(broker, options) -> Config
    subFactory(broker, options) -> Config
after:
    Moth(options: dict, is_subscriber: bool) -> Config
    pubFactory(options) -> Config
    subFactory(options) -> Config
sarracenia.config API:
One should now call sarracenia.config.finalize() after having set options and before the configuration is used. This routine reconciles the settings provided and builds some derived ones.
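For code that still holds the broker and the options separately, the adjustment amounts to folding the broker into the options dictionary before calling the factory. The sketch below shows only that folding step; the helper name new_style_options is hypothetical, and the broker is shown as a placeholder string, whereas real code would pass whatever broker object it used before.

```python
def new_style_options(broker, options):
    # Copy first, so the caller's dictionary is not modified.
    merged = dict(options)
    # The broker now travels inside the options instead of beside them.
    merged["broker"] = broker
    return merged


opts = new_style_options("amqps://broker.example", {"topicPrefix": ["v03"]})
print(sorted(opts))  # ['broker', 'topicPrefix']
```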
3.0.37
- BUG: sr3 cleanup does not work at all.
3.0.26
- CHANGE: event options (logEvents and fileEvents) now replace the previous value. They used to be unioned (or’d) with the previous value. One can now preface the set elements with + to get the previous behaviour. Also, - is available to remove an element from a set option. (sr3 convert now prefixes v2 values with +.)
- CHANGE: fileEvents: new events are present, mkdir and rmdir, so some adjustment to fileEvents settings may now be required.
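The replace, + and - semantics of set options described above can be sketched as follows. This assumes a comma-separated value in which each element may carry a + or - prefix; the function name apply_set_option is hypothetical and this is not the sr3 parser itself.

```python
def apply_set_option(previous, value):
    elements = [e.strip() for e in value.split(",") if e.strip()]
    # No prefixed element at all: the new value simply replaces the old one.
    if not any(e[0] in "+-" for e in elements):
        return set(elements)
    result = set(previous)
    for e in elements:
        if e.startswith("+"):
            result.add(e[1:])      # union with the previous value
        elif e.startswith("-"):
            result.discard(e[1:])  # remove from the previous value
        else:
            result.add(e)
    return result


current = {"create", "modify"}
print(sorted(apply_set_option(current, "mkdir")))    # ['mkdir']
print(sorted(apply_set_option(current, "+mkdir")))   # ['create', 'mkdir', 'modify']
print(sorted(apply_set_option(current, "-modify")))  # ['create']
```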
3.0.25
- CHANGE: default value for acceptUnmatched is now True for all components. Prior to this release, the default was False in the subscribe component, and True for all others.
3.0.23
- NOTICE: strftime date specifications are now preferred in patterns, in place of the ones inherited from Sundew. These are converted by sr3 convert.
- CHANGE: removed please_stop_immediately option added in 3.0.22
(all components now stop more quickly, so not needed.)
3.0.22
CHANGE: destination, when used in a poll is replaced by pollUrl
CHANGE: destination, when used in a sender is replaced by sendTo
- ACTION: replace destination settings in affected configurations.
(automatically taken care of in v2 when converting.)
- NOTICE: when a file is renamed, sr3 has always processed only one of the two messages produced to announce it, for compatibility with v2 naming. There is now an option, v2compatRenameDoublePost, and by default sr3 now posts only a single message when a file is renamed.
3.0.17
- CHANGE: The “Vendor” string is now “MetPX” instead of “science.gc.ca”.
This affects some file placement particularly on Windows.
- CHANGE: v03 notification message encoding changed: Identity checksum is now optional.
(details: https://github.com/MetPX/sarracenia/issues/547 ) md5sum is no longer defined, replaced with none in sr3.
- CHANGE: v03 notification message encoding changed for symbolic links, and file renames
and removals. There is now a ‘fileOp’ field for these dataless file operations. The Identity sum is now used exclusively for checksums.
3.0.15
- NOTICE: re-instating debian and windows packages by removing hard requirements for python modules
which are difficult to satisfy. From 3.0.15, dependencies are modular.
CHANGE: there are now four “extras” configured for pip packages for metpx-sr3.
amqp - ability to communicate with AMQP (rabbitmq) brokers
mqtt - ability to communicate with MQTT brokers
ftppoll - ability to poll FTP servers
vip - enable vip (Virtual IP) settings to implement singleton processing for high availability support.
With pip installation, one can include all the extras via:
pip install metpx-sr3[all]
With Linux packages, install the corresponding native packages to activate the corresponding features. On Ubuntu, respectively:
apt install python3-amqp
apt install python3-magic
apt install python3-paramiko
apt install python3-paho-mqtt
apt install python3-dateparser python3-tz
apt install python3-netifaces
sr3 looks for the relevant modules on startup and automatically enables support for the relevant features.
- CHANGE: placement of the file denoting disabled configurations. It used to be that ~/.config/sr3/component/x.conf would be renamed x.conf.off when disabling. Now, instead, a file called ~/.cache/sr3/component/x/disabled is created. Configuration files are no longer changed by this sort of routine intervention.
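Scripts that used to look for x.conf.off can instead test for the marker file. The following is a minimal sketch assuming the Linux layout given above; the helper names disabled_marker and is_disabled, and the cache_root parameter, are hypothetical.

```python
from pathlib import Path


def disabled_marker(component, config, cache_root=None):
    # Default to the Linux location described above: ~/.cache/sr3
    root = Path(cache_root) if cache_root is not None else Path.home() / ".cache" / "sr3"
    return root / component / config / "disabled"


def is_disabled(component, config, cache_root=None):
    # A configuration counts as disabled when the marker file exists.
    return disabled_marker(component, config, cache_root).exists()


print(disabled_marker("subscribe", "x", "/tmp/sr3-cache"))
```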
3.0.14
initial beta.
- NOTICE: only pip packages currently work. No Debian packages on launchpad.net
nor any windows packages.
V2 to Sr3
- NOTICE: Sr3 is a very deep refactor of Sarracenia. For more detail on the nature of the changes, see the Sarracenia documentation. Briefly, where v2 is an application written in Python that had a small extension facility, Sr3 is a toolkit that naturally provides an API and is far more pythonic. Sr3 is built with less, and more maintainable, code, and it supports more features, more naturally.
- CHANGE: log messages look completely different. Any log parsing will have to be reviewed.
New log format includes a prefix with process-id and the routine generating the notification message.
CHANGE: default message format in sr3 is v03. In v2, the default format was v02.
- CHANGE: default topicPrefix and post_topicPrefix in sr3 is ‘v03’ … in v2 it
was ‘v02.post’
- NOTICE: When migrating from v2 to sr3, simple configurations will mostly “just work.”
However, cases relying on user built plugins will require effort to port. The built-in plugins provided with Sarracenia have been ported as updated examples.
- CHANGE: file placement. On Linux:
~/.cache/sarra -> ~/.cache/sr3
~/.config/sarra -> ~/.config/sr3
Similar change on other platforms. The different placement allows both v2 and sr3 to run at the same time on the same server.
- NOTICE: to change configurations from v2 to sr3, rather than copying the file
from one directory to the other, use of the convert directive is recommended:
sr3 convert subscribe/mine.conf
will make all mechanical conversions of directive names from v2 to sr3 automatically. Only custom plugin work needs to be manually ported, as described below.
- NOTICE: In sr3 the winnowing or duplicate suppression algorithm (implemented by sarracenia.flowcb.nodupe.NoDupe.py)
is separate from the data source’s checksum algorithm.
In v2, the checksum algorithm had to be harmonized with the data source checksum. In sr3 one can select any checksumming method, and still customize how message key and path are selected to allow for full customization of duplicate suppression.
- CHANGE: Command line interface (CLI) is different. There is only one main entry_point: sr3.
so most invocations are different in a pattern like so:
sr_subscribe start config -> sr3 start subscribe/config
in sr3 one can specify a series of configurations to operate on in a single command:
sr3 start poll/airnow subscribe/airnow sender/cmqb
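The invocation pattern above can be expressed as a small translation function, for example when updating wrapper scripts. This sketch handles only the simple three-word form shown in the example; the function name v2_to_sr3_command is hypothetical.

```python
def v2_to_sr3_command(v2_command):
    # Expect exactly: sr_<component> <action> <config>
    program, action, config = v2_command.split()
    if not program.startswith("sr_"):
        raise ValueError("not a v2 style command: " + v2_command)
    component = program[len("sr_"):]
    return f"sr3 {action} {component}/{config}"


print(v2_to_sr3_command("sr_subscribe start config"))  # sr3 start subscribe/config
```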
- CHANGE: in sr3, use -- for full word options, like --config, or --broker. In v2 you could use -config and -broker, but in sr3 a single dash is reserved for single character options. This is a result of sr3 using Python's standard argparse module:
-config hoho.conf
in v2 refers to loading the hoho.conf file as a configuration. In sr3, it will be interpreted as -c (config) asking to load the onfig.conf file, and hoho.conf is taken as part of some subsequent option. In sr3:
--config hoho.conf
does that as intended.
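The single-dash behaviour can be reproduced with a stand-in argparse parser. The parser below is an assumption made for illustration, defining only -c/--config; it is not the parser sr3 builds, which has many more options.

```python
import argparse

# Stand-in parser with one option, mimicking a -c/--config pair.
parser = argparse.ArgumentParser()
parser.add_argument("-c", "--config")

# Single dash: argparse reads "-c" with the attached value "onfig",
# and hoho.conf is left over as an unrecognized argument.
single, leftover = parser.parse_known_args(["-config", "hoho.conf"])
print(single.config, leftover)  # onfig ['hoho.conf']

# Double dash: the full word option receives hoho.conf as its value.
double, _ = parser.parse_known_args(["--config", "hoho.conf"])
print(double.config)  # hoho.conf
```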
CHANGE: sr3 poll works very differently from v2.
v2 behaviour                                   sr3 behaviour
---------------------------------------------  --------------------------------------------------
all participants in a vip poll remote always   One node (with vip) polls remote.
all participants in a vip update ls_files      nodes subscribe to the output exchange
poll builds strings to describe files          poll builds stat(2) like paramiko.SftpAttributes()
participants rely on their ls_files for state  poll uses flowcb.nodupe module like rest of sr3
file_time_limit to ignore older files          fileAgeMax
destination gives where to poll                pollUrl
directory gives remote directory to list       path used like in post and watch
need accept per directory                      need only one accept
get is a sort of remote pattern filtering      accept same as used by all other components.
do_poll plugins used to override default       poll entry point in flow callbacks
do_poll used to HTTP GET periodically          flowcb.scheduled more elegant.
The sr3 convert function takes care of the necessary configuration changes, but plugins need ground up rewrites, as they work completely differently.
All of these changes make poll’s use of the configuration language more consistent with how it is used in other components. For example, directory was confusing because, in poll, it was used to determine the source directory to be polled, while in all other components it refers to the download location. The path option replaces it, and poll uses it the same way post and watch do: to denote the paths that should be observed.
In sr3, when the vip setting is present, poll will create a queue bound to the post_broker/post_exchange in order to see the posts done by other participants. Queue naming options are therefore useful in sr3.
CHANGE: In general, underscores in options are replaced with camelCase. e.g.:
v2 loglevel -> sr3 logLevel
v2 options that are renamed will be understood, but an informational message will be produced on startup. Underscore is still used for grouping purposes. Options which have changed:
v2 Option                   v3 Option
--------------------------  ------------------
accel_scp_threshold         accelThreshold
accel_wget_threshold        accelThreshold
accept_unmatch              acceptUnmatched
accept_unmatched            acceptUnmatched
base_dir                    baseDir
basedir                     baseDir
baseurl                     baseUrl
bind_queue                  queueBind
cache                       nodupe_ttl
cache_basis                 nodupe_basis
caching                     nodupe_ttl
chmod                       permDefault
chmod_dir                   permDirDefault
chmod_log                   permLog
declare_exchange            exchangeDeclare
declare_queue               queueDeclare
default_dir_mode            permDirDefault
default_log_mode            permLog
default_mode                permDefault
destination                 pollUrl in Poll
destination                 sendTo in Sender
document_root               documentRoot
e                           fileEvents
events                      fileEvents
exchange_split              exchangeSplit
file_time_limit             fileAgeMax
hb_memory_baseline_file     MemoryBaseLineFile
hb_memory_max               MemoryMax
hb_memory_multiplier        MemoryMultiplier
heartbeat                   housekeeping
instance                    instances
ll                          logLevel
logRotate                   logRotateCount
logRotate_interval          logRotateInterval
log_format                  logFormat
log_reject                  logReject
logdays                     logRotateCount
loglevel                    logLevel
no_duplicates               nodupe_ttl
post_base_dir               post_baseDir
post_base_url               post_baseUrl
post_basedir                post_baseDir
post_baseurl                post_baseUrl
post_document_root          post_documentRoot
post_exchange_split         post_exchangeSplit
post_rate_limit             messageRateMax
post_topic_prefix           post_topicPrefix
preserve_mode               permCopy
preserve_time               timeCopy
queue_name                  queueName
report_back                 report
source_from_exchange        sourceFromExchange
sum                         identity
suppress_duplicates         nodupe_ttl
suppress_duplicates_basis   nodupe_basis
topic_prefix                topicPrefix
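To show how such a rename table is applied to a configuration line, here is a minimal sketch using a handful of the pairs listed above. The names V2_TO_SR3 and rename_option are hypothetical, and the sketch only renames the first word of a line; sr3 convert remains the tool that performs the real conversion.

```python
# A few pairs taken from the table above; the full table has many more.
V2_TO_SR3 = {
    "accept_unmatch": "acceptUnmatched",
    "base_dir": "baseDir",
    "heartbeat": "housekeeping",
    "loglevel": "logLevel",
    "queue_name": "queueName",
    "topic_prefix": "topicPrefix",
}


def rename_option(line):
    # Split into the option name and the rest of the line.
    words = line.split(None, 1)
    if not words:
        return line
    # Unknown options are passed through unchanged.
    words[0] = V2_TO_SR3.get(words[0], words[0])
    return " ".join(words)


print(rename_option("loglevel debug"))  # logLevel debug
print(rename_option("instances 5"))     # instances 5
```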
- CHANGE: default topic_prefix v02.post -> topicPrefix v03. Configurations may need to override the default in order to remain compatible.
- CHANGE: v2: mirror defaults to False on all components except sarra.
sr3: mirror defaults to True on all components except subscribe.
- NOTICE: The most common v2 plugins are on_message and on_file (as per plugin and on_ directives in v2 configuration files), which can be honoured via the v2wrapper sr3 plugin class. Many other plugins were ported, and the configuration module recognizes the old configuration settings and interprets them in the new style. The known conversions can be viewed by starting a Python interpreter:
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sarracenia.config,pprint
>>> pp=pprint.PrettyPrinter()
>>> pp.pprint(sarracenia.config.convert_to_v3)
{
    'do_send': {
        'file_email': ['flowCallback', 'sarracenia.flowcb.send.email.Email']
    },
    'ls_file_index': ['continue'],
    'no_download': ['download', 'False'],
    'notify_only': ['download', 'False'],
    'on_message':{
        'msg_2http': ['flow_callback', 'sarracenia.flowcb.accept.tohttp.ToHttp'],
        'msg_2local': ['flow_callback', 'sarracenia.flowcb.accept.tolocal.ToLocal'],
        'msg_2localfile': ['flow_callback', 'sarracenia.flowcb.accept.tolocalfile.ToLocalFile'],
        'msg_WMO_type_suffix': ['flow_callback', 'sarracenia.flowcb.accept.wmotypesuffix.WmoTypeSuffix'],
        'msg_by_source': ['continue'],
        'msg_by_user': ['continue'],
        'msg_delay': ['flow_callback', 'sarracenia.flowcb.accept.messagedelay.MessageDelay'],
        'msg_delete': ['flow_callback', 'sarracenia.flowcb.filter.deleteflowfiles.DeleteFlowFiles'],
        'msg_download': ['continue'],
        'msg_download_baseurl': ['flow_callback', 'sarracenia.flowcb.accept.downloadbaseurl.DownloadBaseUrl'],
        'msg_dump': ['continue'],
        'msg_fdelay': ['continue'],
        'msg_from_cluster': ['continue'],
        'msg_gts2wistopic': ['continue'],
        'msg_hour_tree': ['flow_callback', 'sarracenia.flowcb.accept.hourtree.HourTree'],
        'msg_http_to_https': ['flow_callback', 'sarracenia.flowcb.accept.httptohttps.HttpToHttps'],
        'msg_log': ['logEvents', 'after_accept'],
        'msg_overwrite_sum': ['continue'],
        'msg_print_lag': ['flow_callback', 'sarracenia.flowcb.accept.printlag.PrintLag'],
        'msg_rawlog': ['logEvents', 'after_accept'],
        'msg_rename4jicc': ['flow_callback', 'sarracenia.flowcb.accept.rename4jicc.Rename4Jicc'],
        'msg_rename_dmf': ['flow_callback', 'sarracenia.flowcb.accept.renamedmf.RenameDMF'],
        'msg_rename_whatfn': ['flow_callback', 'sarracenia.flowcb.accept.renamewhatfn.RenameWhatFn'],
        'msg_renamer': ['flow_callback', 'sarracenia.flowcb.accept.renamer.Renamer'],
        'msg_save': ['flow_callback', 'sarracenia.flowcb.accept.save.Save'],
        'msg_skip_old': ['flow_callback', 'sarracenia.flowcb.accept.skipold.SkipOld'],
        'msg_speedo': ['flow_callback', 'sarracenia.flowcb.accept.speedo.Speedo'],
        'msg_stdfiles': ['continue'],
        'msg_stopper': ['continue'],
        'msg_sundew_pxroute': ['flow_callback', 'sarracenia.flowcb.accept.sundewpxroute.SundewPxRoute'],
        'msg_test_retry': ['flow_callback', 'sarracenia.flowcb.accept.testretry.TestRetry'],
        'msg_to_clusters': ['flow_callback', 'sarracenia.flowcb.accept.toclusters.ToClusters'],
        'msg_total': ['continue'],
        'msg_total_save': ['continue'],
        'post_hour_tree': ['flow_callback', 'sarracenia.flowcb.accept.posthourtree.PostHourTree'],
        'post_long_flow': ['flow_callback', 'sarracenia.flowcb.accept.longflow.LongFLow'],
        'post_override': ['flow_callback', 'sarracenia.flowcb.accept.postoverride.PostOverride'],
        'post_total': ['continue'],
        'post_total_save': ['continue'],
        'wmo2msc': ['flow_callback', 'sarracenia.flowcb.filter.wmo2msc.Wmo2Msc']
    },
    'on_post': {
        'post_log': ['logEvents', 'after_work']
    },
    'plugin': {
        'accel_scp': ['continue'],
        'accel_wget': ['continue'],
        'msg_fdelay': ['flowCallback', 'sarracenia.flowcb.filter.fdelay.FDelay'],
        'msg_pclean_f90': ['flowCallback', 'sarracenia.flowcb.filter.pclean_f90.PClean_F90'],
        'msg_pclean_f92': ['flowCallback', 'sarracenia.flowcb.filter.pclean_f92.PClean_F92']
    },
    'windows_run': ['continue'],
    'xattr_disable': ['continue']
}
>>>
The options listed as ‘continue’ are obsolete ones, superseded by default processing, or rendered unnecessary by changes in the implementation.
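To pick the obsolete entries out of such a table, one can walk it recursively and collect everything mapped to ‘continue’. The sketch below runs on a small hand-copied excerpt rather than on sarracenia.config.convert_to_v3 itself, and the function name obsolete_entries is hypothetical.

```python
# Hand-copied excerpt of the table shown above, for illustration only.
excerpt = {
    "ls_file_index": ["continue"],
    "no_download": ["download", "False"],
    "on_message": {
        "msg_log": ["logEvents", "after_accept"],
        "msg_dump": ["continue"],
    },
}


def obsolete_entries(table, prefix=""):
    found = []
    for name, value in table.items():
        if isinstance(value, dict):
            # Nested tables hold plugin names under a v2 directive.
            found.extend(obsolete_entries(value, prefix + name + " "))
        elif value == ["continue"]:
            found.append(prefix + name)
    return found


print(obsolete_entries(excerpt))  # ['ls_file_index', 'on_message msg_dump']
```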
- NOTICE: for API users and plugin writers, the v2 plugin format is entirely replaced by
the Flow Callback class. New plugin functionality can mostly be implemented as plugins.
- CHANGE: the v2 do_poll plugins must be replaced by subclassing for poll. See the example in plugin porting.
CHANGE: The v2 on_html_page plugins are also replaced by subclassing poll.
CHANGE: v2 do_send is replaced by the send entrypoint in a Flowcb plugin (see plugin porting).
- NOTICE: the v2 accelerator plugins are replaced by built-in acceleration. accel_wget_command, accel_scp_command, accel_ftpget_command, and accel_ftpput_command are now built-in options used by the Transfer class. Adding new transfer protocols is done by sub-classing Transfer.
SHOULD: v2 on_message -> after_accept should be re-written (see plugin porting).
SHOULD: v2 on_file -> after_work should be re-written (see plugin porting).
- SHOULD: v2 plugins should be re-written (see plugin porting). There are many built-in plugins that are ported and automatically converted, but external ones must be re-written.
There are some performance consequences from this compatibility, however, so high traffic flows will run with less CPU and memory load if the plugins are ported to sr3. To build native sr3 plugins, one should investigate the flowCallback (flowcb) class.
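As a rough picture of what a native callback looks like, here is a minimal after_accept sketch. It assumes the worklist carries incoming and rejected lists and that messages expose a relPath field; the class name SkipTmp is hypothetical, and the class does not subclass sarracenia.flowcb.FlowCB so that it can run on its own with a stand-in worklist.

```python
from types import SimpleNamespace


class SkipTmp:
    """Hypothetical callback; a real one would subclass sarracenia.flowcb.FlowCB."""

    def after_accept(self, worklist):
        kept = []
        for msg in worklist.incoming:
            if msg["relPath"].endswith(".tmp"):
                # Assumption: rejected messages are moved to worklist.rejected.
                worklist.rejected.append(msg)
            else:
                kept.append(msg)
        worklist.incoming = kept


# Stand-in worklist for demonstration; sr3 supplies the real one.
wl = SimpleNamespace(incoming=[{"relPath": "a/b.txt"}, {"relPath": "a/c.tmp"}], rejected=[])
SkipTmp().after_accept(wl)
print(wl.incoming, wl.rejected)
```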
CHANGE: on_watch plugins entry_point becomes an sr3 after_accept entrypoint in a flowcb in a watch.
- ACTION: The sr_audit component is gone. It is replaced by running sr3 sanity as a cron job (or scheduled task on Windows) to make sure that necessary processes continue to run.
- CHANGE: obsolete settings: use_amqplib, use_pika. the new sarracenia.moth.amqp
uses the amqp library. To use other libraries, one should create new subclasses of sarracenia.moth.
- CHANGE: statehost is now a boolean flag, fqdn option no longer implemented.
if this is a problem, submit an issue. It’s just not considered worthwhile for now.
- CHANGE: sr_retry became retry.py.
Any plugins accessing internal structures of sr_retry.py need to be re-written. This access is no longer necessary, as the API defines how to put notification messages on the retry queue (move notification messages to worklist.failed).
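Moving a message to worklist.failed can be sketched as below. Only worklist.failed comes from the text above; the worklist.ok list, the class name RequeueFailures and the deliver function are assumptions made for illustration, and the worklist is a stand-in object rather than the one sr3 supplies.

```python
from types import SimpleNamespace


class RequeueFailures:
    """Hypothetical callback; a real one would subclass sarracenia.flowcb.FlowCB."""

    def __init__(self, deliver):
        self.deliver = deliver  # hypothetical function doing the real work

    def after_work(self, worklist):
        still_ok = []
        for msg in worklist.ok:
            try:
                self.deliver(msg)
                still_ok.append(msg)
            except OSError:
                # Moving the message to worklist.failed queues it for retry.
                worklist.failed.append(msg)
        worklist.ok = still_ok


def flaky(msg):
    # Simulated work that fails for one kind of file.
    if msg["relPath"].endswith(".bad"):
        raise OSError("simulated failure")


wl2 = SimpleNamespace(ok=[{"relPath": "a.txt"}, {"relPath": "b.bad"}], failed=[])
RequeueFailures(flaky).after_work(wl2)
print(wl2.failed)  # [{'relPath': 'b.bad'}]
```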
- NOTICE: sr3 watch, with the force_polling option, is much less efficient in sr3 than in v2 for large directory trees (see issue #403). Ideally, one does not use force_polling at all.