
   1
   2
   3 .. This work is licensed under a Creative Commons Attribution 4.0 International License.
   4 .. SPDX-License-Identifier: CC-BY-4.0
   5 .. CAUTION: this document is generated from source in doc/src/rtd.
   6 .. To make changes edit the source and recompile the document.
   7 .. Do NOT make changes directly to .rst or .md files.
   8
   9
  10 ============================================================================================
  11 RIC Message Router -- RMR
  12 ============================================================================================
  13 --------------------------------------------------------------------------------------------
  14 User's Manual
  15 --------------------------------------------------------------------------------------------
  16
  17 Overview
  18 ============================================================================================
  19
The RIC Message Router (RMR) is a library for peer-to-peer communication.
Applications use the library to send and receive messages where the message
routing and endpoint selection are based on the message type rather than on DNS
host name and IP port combinations. The library provides the following major features:
  24
  25
  26 + Routing and endpoint selection is based on *message type.*
  27
  28 + Application is insulated from the underlying transport mechanism and/or protocols.
  29
  30 + Message distribution (round robin or fanout) is selectable by message type.
  31
  32 + Route management updates are received and processed asynchronously and without overt application involvement.
  33
  34
  35
  36 Purpose
  37 --------------------------------------------------------------------------------------------
  38
  39 RMR's main purpose is to provide an application with the
  40 ability to send and receive messages to/from other peer
  41 applications with minimal effort on the application's part.
  42 To achieve this, RMR manages all endpoint information,
  43 connections, and routing information necessary to establish
  44 and maintain communication. From the application's point of
  45 view, all that is required to send a message is to allocate
  46 (via RMR) a message buffer, add the payload data, and set the
  47 message type. To receive a message, the application needs
  48 only to invoke the receive function; when a message arrives a
  49 message buffer will be returned as the function result.
  50
  51 Message Routing
  52 --------------------------------------------------------------------------------------------
  53
Applications are required to place a message type into a
message before sending, and may optionally add a subscription
ID when appropriate. The combination of message type and
subscription ID is referred to as the *message key,* and is
used to match an entry in a routing table which provides the
possible endpoints expecting to receive messages with the
matching key.
  61
  62 Round Robin Delivery
  63 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  64
  65 An endpoint from RMR's perspective is an application to which
  66 RMR may establish a connection, and expect to send messages
  67 with one or more defined message keys. Each entry in the
  68 route table consists of one or more endpoint groups, called
  69 round robin groups. When a message matches a specific entry,
  70 the entry's groups are used to select the destination of the
  71 message. A message is sent once to each group, with messages
  72 being *balanced* across the endpoints of a group via round
  73 robin selection. Care should be taken when defining multiple
  74 groups for a message type as there is extra overhead required
  75 and thus the overall message latency is somewhat increased.
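
The selection rules described above can be sketched in a few
lines of Python. This is an illustrative model only and not
RMR's implementation: the ``RouteTable`` class, the endpoint
names, and the ``VOID_SUBID`` placeholder are assumptions made
for the example.

```python
from itertools import cycle

# Placeholder for "no subscription ID"; not the library's actual constant.
VOID_SUBID = -1


class RouteTable:
    """Hypothetical route table keyed by (message type, subscription ID)."""

    def __init__(self, entries):
        # Each group gets its own cursor so it balances independently.
        self._entries = {
            key: [cycle(group) for group in groups]
            for key, groups in entries.items()
        }

    def select(self, mtype, sub_id=VOID_SUBID):
        """Return one endpoint per group for the key, or [] if unrouted."""
        groups = self._entries.get((mtype, sub_id), [])
        return [next(group) for group in groups]


# One message type routed to two groups: a balanced pair and a single logger.
table = RouteTable({
    (1000, VOID_SUBID): [["app-a:4560", "app-b:4560"], ["logger:4560"]],
})
first = table.select(1000)
second = table.select(1000)
third = table.select(1000)
```

Each call to ``select`` yields one endpoint per group, so a key
with two groups results in two sends per message; this is the
extra overhead that the paragraph above cautions about.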
  76
  77 Routing Table Updates
  78 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  79
Route table information is made available to RMR as a static
file (loaded once), or by updates sent from a separate route
manager application. If a static table is provided, it is
loaded during RMR initialization and will remain in use until
an external process connects and delivers a route table
update (often referred to as a dynamic update). Dynamic
updates are listened for in a separate thread and
applied automatically; the application does not need to allow
for, or trigger, updates.
  89
  90 Latency And Throughput
  91 --------------------------------------------------------------------------------------------
  92
  93 While providing insulation from the underlying message
  94 transport mechanics, RMR must also do so in such a manner
  95 that message latency and throughput are not impacted. In
  96 general, the RMR induced overhead, incurred due to the
  97 process of selecting an endpoint for each message, is minimal
  98 and should not impact the overall latency or throughput of
  99 the application. This impact has been measured with test
 100 applications running on the same physical host and the
 101 average latency through RMR for a message was on the order of
 102 0.02 milliseconds.
 103
 104 As an application's throughput increases, it becomes easy for
 105 the application to overrun the underlying transport mechanism
 106 (e.g. NNG), consume all available TCP transmit buffers, or
 107 otherwise find itself in a situation where a send might not
 108 immediately complete. RMR offers different *modes* which
 109 allow the application to manage these states based on the
 110 overall needs of the application. These modes are discussed
 111 in the *Configuration* section of this document.
 112
 113 General Use
 114 ============================================================================================
 115
 116 To use, the RMR based application simply needs to initialise
 117 the RMR environment, wait for RMR to have received a routing
 118 table (become ready), and then invoke either the send or
 119 receive functions. These steps, and some behind the scenes
 120 details, are described in the following paragraphs.
 121
 122 Initialisation
 123 --------------------------------------------------------------------------------------------
 124
The RMR initialisation function is used to set up the RMR
environment and must be called before messages can be sent or
received. One of the few parameters that the application must
communicate to RMR is the port number that will be used as
the listen port for new connections. The port number is
passed in the initialisation function call and a TCP listen
socket will be opened with this port. If the port is already
in use RMR will report a failure; the application will need
to reinitialise with a different port number, abort, or take
some other action appropriate for the application.
 135
 136 In addition to creating a TCP listen port, RMR will start a
 137 process thread which will be responsible for receiving
 138 dynamic updates to the route table. This thread also causes a
 139 TCP listen port to be opened as it is expected that the
 140 process which generates route table updates will connect and
 141 send new information when needed. The route table update port
 142 is **not** supplied by the application, but is supplied via
 143 an environment variable as this value is likely determined by
 144 the mechanism which is starting and configuring the
 145 application.
 146
 147 The RMR Context
 148 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 149
 150 On successful initialisation, a void pointer, often called a
 151 *handle* by some programming languages, is returned to the
 152 application. This is a reference to the RMR control
 153 information and must be passed as the first parameter on most
 154 RMR function calls. RMR refers to this as the context, or
 155 ctx.
 156
 157 Wait For Ready
 158 --------------------------------------------------------------------------------------------
 159
An application which is only receiving messages does not need
to wait for RMR to *become ready* after the call to the
initialisation function. However, before the application can
successfully send a message, RMR must have loaded a route
table, and the application must wait for RMR to report that
it has done so. The RMR readiness function will return the
value *true* (1) when a complete route table has been loaded
and can be used to determine the endpoint for a send request.
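
The wait can be sketched as a simple polling loop. The example
below is a stand-alone Python illustration: ``is_ready`` stands
in for whatever readiness check the application uses, and the
timeout and poll interval are assumptions, not RMR defaults.

```python
import time


def wait_until_ready(is_ready, timeout_s=10.0, poll_s=0.1,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it reports true or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


# Stand-in readiness check that reports true on the third poll.
polls = {"count": 0}


def fake_ready():
    polls["count"] += 1
    return polls["count"] >= 3


became_ready = wait_until_ready(fake_ready, timeout_s=1.0, poll_s=0.0)
```

A sender would normally run such a loop once after
initialisation; a receive-only application can skip it.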
 168
 169 Receiving Messages
 170 --------------------------------------------------------------------------------------------
 171
The process of receiving is fairly straightforward. The
application invokes the RMR receive function, which will block
until a message is received. The function returns a pointer to
a message block which provides all of the details about the
message. Specifically, the application has access to the
following information either directly or indirectly:
 178
 179
 180 + The payload (actual data)
 181
 182 + The total payload length in bytes
 183
 184 + The number of bytes of the payload which contain valid data
 185
 186 + The message type and subscription ID values
 187
 188 + The hostname and IP address of the source of the message (the sender)
 189
 190 + The transaction ID
 191
 192 + Tracing data (if provided)
 193
 194
 195
 196 The Message Payload
 197 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 198
 199 The message payload contains the *raw* data that was sent by
 200 the peer application. The format will likely depend on the
 201 message type, and is expected to be known by the application.
 202 A direct pointer to the payload is available from the message
 203 buffer (see appendix B for specific message buffer details).
 204
Two payload-related length values are also directly
available: the total payload length, and the number of bytes
actually filled with data. The used length is set by the
sending application, and may or may not be an accurate value.
The total payload length is determined when the buffer is
created for sending, and is the maximum number of bytes that
the application may modify should the buffer be used to return
a response.
 213
 214 Message Type and Subscription ID
 215 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 216
The message type and subscription ID are both directly
available from the message buffer, and are the values which
were used by RMR in the sending application to select the
endpoint. If the application resends the message, as opposed
to returning the message buffer as a response, the message
type and/or the subscription ID might need to be changed to
avoid potential issues[1].
 224
 225 Sender Information
 226 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 227
The source, or sender information, is indirectly available to
the application via two RMR functions. One returns a string
containing hostname:port, while the other returns the string
ip:port.
 232
 233 Transaction ID
 234 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 235
The message buffer contains a fixed length set of bytes which
applications can set to track related messages across the
application concept of a transaction. RMR will use the
transaction ID for matching a response message when the call
function is used to send a message.
 241
 242 Trace Information
 243 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 244
RMR supports the addition of optional trace information to
any message. The presence and size are controlled by the
application, and can vary from message to message if desired.
The actual contents of the trace information are determined by
the application; RMR provides only the means to set, extract,
and obtain a direct reference to the trace bytes. The trace
data field in a message buffer is discussed in greater detail
in the *Trace Data* section.
 253
 254 Sending Messages
 255 --------------------------------------------------------------------------------------------
 256
 257 Sending requires only slightly more work on the part of the
 258 application than receiving a message. The application must
 259 allocate an RMR message buffer, populate the message payload
 260 with data, set the message type and length, and optionally
 261 set the subscription ID. Information such as the source IP
 262 address, hostname, and port are automatically added to the
 263 message buffer by RMR, so there is no need for the
 264 application to worry about these.
 265
 266 Message Buffer Allocation
 267 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 268
The message allocation function allocates a *zero copy* buffer
and returns a pointer to the RMR rmr_mbuf_t structure. The
message buffer provides direct access to the payload, length,
message type and subscription ID fields. The buffer must be
preallocated in order to allow the underlying transport
mechanism to allocate the payload space from its internal
memory pool; this eliminates multiple copies as the message is
sent, and thus is more efficient.
 277
If a message buffer has been received, and the application
wishes to use the buffer to send a response, or to forward
the buffer to another application, a new buffer does **not**
need to be allocated. The application may set the necessary
information (message type, etc.), adjust the payload as
necessary, and then pass the message buffer to the send
function or to the return to sender function.
 285
 286 Populating the Message Buffer
 287 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 288
 289 The application has direct access to several of the message
 290 buffer fields, and should set them appropriately.
 291
 292
 293
 294 len
 295
 296   This is the number of bytes that the application placed
 297   into the payload. Setting length to 0 is allowed, and
 298   length may be less than the allocated payload size.
 299
 300
 301 mtype
 302
 303   The message type that RMR will use to determine the
 304   endpoint used as the target of the send.
 305
 306
 307 sub_id
 308
 309   The subscription ID if the message is to be routed based
 310   on the combination of message type and subscription ID. If
 311   no subscription ID is valid for the message, the
 312   application should set the field with the RMR constant
 313   RMR_VOID_SUBID.
 314
 315
payload

  The application should obtain the reference (pointer) to
  the payload from the message buffer and place any data
  into the payload. The application is responsible for
  ensuring that the maximum payload size is not exceeded.
  The application may obtain the maximum size via the RMR
  function provided for that purpose.
 324
 325
 326 trace data
 327
 328   Optionally, the application may add trace information to
 329   the message buffer.
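
The fields listed above can be modelled with a small Python
sketch. The ``MessageBuffer`` class and ``populate`` helper are
hypothetical and only mirror the field roles described here
(``length`` plays the part of ``len``); they are not the
rmr_mbuf_t structure, and ``VOID_SUBID`` is a placeholder value.

```python
from dataclasses import dataclass, field

# Placeholder for RMR_VOID_SUBID; the library's real value is not assumed.
VOID_SUBID = -1


@dataclass
class MessageBuffer:
    alloc_len: int                  # total payload space, fixed at allocation
    mtype: int = 0                  # message type used for routing
    sub_id: int = VOID_SUBID        # subscription ID, or "none"
    length: int = 0                 # bytes of payload holding valid data
    payload: bytearray = field(init=False)

    def __post_init__(self):
        self.payload = bytearray(self.alloc_len)


def populate(mbuf, mtype, data, sub_id=VOID_SUBID):
    """Copy data into the payload and set the routing fields."""
    if len(data) > mbuf.alloc_len:
        raise ValueError("payload exceeds allocated size")
    mbuf.payload[:len(data)] = data
    mbuf.length = len(data)
    mbuf.mtype = mtype
    mbuf.sub_id = sub_id
    return mbuf


msg = populate(MessageBuffer(alloc_len=16), mtype=1000, data=b"hello")
```

The size check is the application's responsibility in this
model, matching the note above that the maximum payload size
must not be exceeded.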
 330
 331
 332
 333 Sending a Message Buffer
 334 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 335
Once the application has populated the necessary bits of a
message, it may be sent by passing the buffer to the send
function. This function will select an endpoint to receive
the message, based on message type and subscription ID, and
will pass the message to the underlying transport mechanism
for actual transmission on the connection. (Depending on the
underlying transport mechanism, the actual connection to the
endpoint may happen at the time of the first message sent to
the endpoint, and thus the latency of the first send might be
longer than expected.)
 346
 347 On success, the send function will return a reference to a
 348 message buffer; the status within that message buffer will
 349 indicate what the message buffer contains. When the status is
 350 RMR_OK the reference is to a **new** message buffer for the
 351 application to use for the next send; the payload size is the
 352 same as the payload size allocated for the message that was
 353 just sent. This is a convenience as it eliminates the need
 354 for the application to call the message allocation function
 355 at some point in the future, and assumes the application will
 356 send many messages which will require the same payload
 357 dimensions.
 358
If the message contains any status other than RMR_OK, then
the message could **not** be sent, and the reference is to
the unsent message buffer. The value of the status will
indicate whether the nature of the failure was transient
(RMR_ERR_RETRY) or not. Transient failures are likely to be
successful if the application attempts to send the message at
a later time. Unfortunately, it is impossible for RMR to know
the exact transient failure (e.g. connection being
established, or TCP buffer shortage), and thus it is not
possible to communicate how long the application should wait
before attempting to resend, if the application wishes to
resend the message. (More discussion with respect to message
retries can be found in the *Handling Failures* section.)
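
One way an application might act on the returned status is
sketched below. This is a stand-alone Python model: the
``send`` callable is a stand-in for the real send operation and
simply returns a status name as a string; the ``send_all``
helper and the sample payloads are invented for illustration.

```python
from collections import deque

# Status names taken from this guide, represented here as plain strings.
OK, RETRY = "RMR_OK", "RMR_ERR_RETRY"


def send_all(payloads, send):
    """Send each payload once; park transient failures for a later attempt."""
    sent = 0
    deferred = deque()   # transient failures, worth trying again later
    failed = []          # hard failures, abandoned
    for payload in payloads:
        status = send(payload)
        if status == OK:
            sent += 1
        elif status == RETRY:
            deferred.append(payload)
        else:
            failed.append((payload, status))
    return sent, deferred, failed


# Stand-in transport: outcome per payload is fixed by this table.
outcomes = {b"a": OK, b"b": RETRY, b"c": "RMR_ERR_NOENDPT"}
sent, deferred, failed = send_all([b"a", b"b", b"c"], outcomes.get)
```

How long to wait before retrying the deferred messages is left
to the application, since RMR cannot indicate the cause of a
transient failure.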
 372
 373 Advanced Usage
 374 ============================================================================================
 375
Several forms of usage fall into a more advanced category and
are described in the following sections. These include the
blocking call, return to sender, and wormhole functions.
 379
 380 The Call Function
 381 --------------------------------------------------------------------------------------------
 382
The RMR call function sends a message in the exact same manner
as the rmr_send_msg() function, with the endpoint selection
based on the message key. Unlike the send function, however,
the call function will block and wait for a response from the
application that is selected to receive the message. The
matching message is determined by the transaction ID which the
application must place into the message buffer prior to
invoking the call. Similarly, the responding application must
ensure that the same transaction ID is placed into the message
buffer before returning its response.
 393
The return from the call is a message buffer with the
response message; there is no difference between a message
buffer returned by the receive function and one returned by
the call function. If a response is not received in a
reasonable amount of time, a nil message buffer is returned to
the calling application.
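
The transaction ID matching can be illustrated with a small
Python model. Everything here is hypothetical: messages are
plain dictionaries, ``send`` and ``receive`` are stand-in
callables, and a bounded number of receives substitutes for the
"reasonable amount of time" mentioned above. Setting
non-matching messages aside in a queue is an assumption of this
sketch.

```python
from collections import deque


def call(send, receive, request, max_receives=100):
    """Send request, then wait for a message with the same transaction ID.

    Returns (response, queued) where response is None on timeout and
    queued holds any non-matching messages seen while waiting.
    """
    send(request)
    queued = deque()
    for _ in range(max_receives):
        msg = receive()
        if msg is None:          # nothing more arrived
            break
        if msg["xaction"] == request["xaction"]:
            return msg, queued
        queued.append(msg)       # unrelated message, keep for later delivery
    return None, queued


# Stand-in inbound traffic: one unrelated message, then the reply.
inbound = iter([
    {"xaction": "other", "payload": b"async"},
    {"xaction": "tx-1", "payload": b"reply"},
])
sent = []
response, queued = call(sent.append, lambda: next(inbound, None),
                        {"xaction": "tx-1", "payload": b"request"})
```

The responder's only obligation in this model is to copy the
request's transaction ID into its reply, as described above.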
 400
 401 Returning a Response
 402 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 403
Because of the nature of RMR's routing policies, it is
generally not possible for an application to control exactly
which endpoint is sent a message. There are cases, such as
responding to a message delivered via the call function, in
which the application must send a message and guarantee that
RMR routes it to an exact destination. To enable this, RMR
provides the return to sender function. Upon receipt of any
message, an application may alter the payload, and if
necessary the message type and subscription ID, and pass the
altered message buffer to the return to sender function to
return the altered message to the application which sent it.
When this function is used, RMR will examine the message
buffer for the source information and use that to select the
connection on which to write the response.
 417
 418 Multi-threaded Calls
 419 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 420
The basic call mechanism described above is **not** thread
safe, as it is not possible to guarantee that a response
message is delivered to the correct thread. The multi-threaded
variant of the call function accepts an additional parameter
which identifies the calling thread in order to ensure that
the response is delivered properly. In addition, the
application must specifically initialise the multi-threaded
call environment by passing the RMRFL_MTCALL flag as an option
to the initialisation function[2].
 429
One advantage of the multi-threaded call capability in RMR is
the fact that only the calling thread is blocked. Messages
received which are not responses to the call continue to be
delivered via normal receive calls.
 434
While the process is blocked waiting for the response, it is
entirely possible that asynchronous, non-matching, messages
will arrive. When this happens, RMR will queue the messages
and return them to the application over the next calls to the
receive function.
 439
 440 Wormholes
 441 --------------------------------------------------------------------------------------------
 442
 443 As was mentioned earlier, the design of RMR is to eliminate
 444 the need for an application to know a specific endpoint, even
 445 when a response message is being sent. In some rare cases it
 446 may be necessary for an application to establish a direct
 447 connection to an RMR-based application rather than relying on
 448 message type and subscription ID based routing. The
 449 *wormhole* functions provide an application with the ability
 450 to create a direct connection and then to send and receive
 451 messages across the connection. The following are the RMR
 452 functions which provide wormhole communications:
 453
 454
 455
 456 rmr_wh_open
 457
 458   Open a connection to an endpoint. Name or IP address and
 459   port of the endpoint is supplied. Returns a wormhole ID
 460   that the application must use when sending a direct
 461   message.
 462
 463
 464 rmr_wh_send_msg
 465
 466   Sends an RMR message buffer to the connected application.
 467   The message type and subscription ID may be set in the
 468   message, but RMR will ignore both.
 469
 470
 471 rmr_wh_close
 472
 473   Closes the direct connection.
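
The open, send, close life cycle can be pictured with the
following Python model. It is a toy registry and not RMR's
wormhole implementation: the ``Wormholes`` class is
hypothetical, and the way it chooses between the
RMR_ERR_NOWHOPEN and RMR_ERR_WHID status names (both listed
later in this guide) is an assumption of the sketch.

```python
# Status names from this guide, represented as plain strings.
WH_OK = "RMR_OK"
ERR_WHID = "RMR_ERR_WHID"
ERR_NOWHOPEN = "RMR_ERR_NOWHOPEN"


class Wormholes:
    """Toy registry of direct connections, keyed by wormhole ID."""

    def __init__(self):
        self._targets = {}
        self._next_id = 0
        self.delivered = []      # (target, payload) pairs "sent" so far

    def open(self, target):
        """Register a direct connection and return its wormhole ID."""
        whid = self._next_id
        self._next_id += 1
        self._targets[whid] = target
        return whid

    def send(self, whid, payload):
        """Deliver straight to the target; no route table is consulted."""
        if not self._targets:
            return ERR_NOWHOPEN
        if whid not in self._targets:
            return ERR_WHID
        self.delivered.append((self._targets[whid], payload))
        return WH_OK

    def close(self, whid):
        self._targets.pop(whid, None)


wh = Wormholes()
whid = wh.open("peer-app:4560")
status = wh.send(whid, b"direct")
```

Because the target is fixed when the wormhole is opened, the
message type and subscription ID play no part in delivery.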
 474
 475
 476
 477 Handling Failures
 478 ============================================================================================
 479
The vast majority of states reported by RMR are fatal; if
encountered during setup or initialization, then it is
unlikely that any message oriented processing should
continue, and when encountered on a message operation
continued operation on that message should be abandoned.
Specifically with regard to message sending, it is very
likely that the underlying transport mechanism will report a
*soft,* or transient, failure which might be successful if
the operation is retried at a later point in time. The
paragraphs below discuss the methods by which an application
might deal with these soft failures.
 491
 492 Failure Notification
 493 --------------------------------------------------------------------------------------------
 494
When a soft failure is reported, the status in the message
buffer returned by the RMR function will be RMR_ERR_RETRY.
These types of failures can occur for various reasons; one of
two reasons is typically the underlying cause:
 499
 500
 501 + The session to the targeted recipient (endpoint) is not connected.
 502
 503 + The transport mechanism buffer pool is full and cannot accept another buffer.
 504
 505
 506
Unfortunately, it is not possible for RMR to determine which
of these two cases is occurring, and, equally unfortunate,
the time to resolve each is different. The first, no
connection, may require up to a second before a message can
be accepted, while a rejection because of buffer shortage is
likely to resolve in less than a millisecond.
 513
 514 Application Response
 515 --------------------------------------------------------------------------------------------
 516
The action which an application takes when a soft failure is
reported ultimately depends on the nature of the application
with respect to factors such as tolerance to extended message
latency, dropped messages, and overall message rate.
 521
 522 RMR Retry Modes
 523 --------------------------------------------------------------------------------------------
 524
In an effort to reduce the workload of an application
developer, RMR has a default retry policy such that RMR will
attempt to retransmit a message up to 1000 times when a soft
failure is reported. These retries generally take less than 1
millisecond (if all 1000 are attempted) and in most cases
eliminate nearly all reported soft failures to the
application. When using this mode, the application might be
able to simply treat all bad return values from a send
attempt as permanent failures.
 534
If an application is sensitive to any delay in RMR, or the
underlying transport mechanism, it is possible to set RMR to
return a failure immediately on any kind of error (permanent
failures are always reported without retry). In this mode,
RMR will still set the state in the message buffer to
RMR_ERR_RETRY, but will **not** make any attempts to resend
the message. This zero-retry policy is enabled by invoking
the corresponding RMR function with a value of 0; this can be
done once, immediately after the initialisation function is
invoked.
 544
Regardless of the retry mode which the application sets, it
will ultimately be up to the application to handle failures
by queuing the message internally for resend, retrying
immediately, or dropping the send attempt altogether. As
stated before, only the application can determine how to best
handle send failures.
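
The difference between the default and zero-retry modes can be
modelled as a bounded loop around a single send attempt. The
Python sketch below is illustrative only: ``send_once`` is a
stand-in callable returning a status name, and
``send_with_retries`` is a hypothetical helper, not an RMR
function. Only the limit of 1000 and the value 0 come from the
text above.

```python
# Status names from this guide, represented as plain strings.
OK, RETRY = "RMR_OK", "RMR_ERR_RETRY"


def send_with_retries(send_once, max_retries=1000):
    """Attempt a send, retrying soft failures up to max_retries times."""
    status = send_once()
    retries = 0
    while status == RETRY and retries < max_retries:
        status = send_once()
        retries += 1
    return status, retries


def flaky_transport(fail_count):
    """Stand-in transport that reports a soft failure fail_count times."""
    remaining = {"n": fail_count}

    def attempt():
        if remaining["n"] > 0:
            remaining["n"] -= 1
            return RETRY
        return OK

    return attempt


default_mode = send_with_retries(flaky_transport(3))
zero_retry_mode = send_with_retries(flaky_transport(3), max_retries=0)
```

In the default mode the soft failures are absorbed before the
caller sees them; with a limit of 0 the first RMR_ERR_RETRY is
handed straight back and the application decides what to do.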
 551
 552 Other Failures
 553 --------------------------------------------------------------------------------------------
 554
RMR will return the state of processing for message based
operations (send/receive) as the status in the message
buffer. For non-message operations, state is returned to the
caller as the integer return value for all functions which
are not expected to return a pointer. The following are the
RMR state constants and a brief description of their meaning.
 561
 562
 563
 564 RMR_OK
 565
 566   state is good; operation finished successfully
 567
 568
 569 RMR_ERR_BADARG
 570
 571   argument passed to function was unusable
 572
 573
 574 RMR_ERR_NOENDPT
 575
 576   send/call could not find an endpoint based on msg type
 577
 578
 579 RMR_ERR_EMPTY
 580
 581   msg received had no payload; attempt to send an empty
 582   message
 583
 584
 585 RMR_ERR_NOHDR
 586
 587   message didn't contain a valid header
 588
 589
 590 RMR_ERR_SENDFAILED
 591
 592   send failed; errno may contain the transport provider
 593   reason
 594
 595
 596 RMR_ERR_CALLFAILED
 597
 598   unable to send the message for a call function; errno may
 599   contain the transport provider reason
 600
 601
 602 RMR_ERR_NOWHOPEN
 603
 604   no wormholes are open
 605
 606
 607 RMR_ERR_WHID
 608
 609   the wormhole id provided was invalid
 610
 611
 612 RMR_ERR_OVERFLOW
 613
 614   operation would have busted through a buffer/field size
 615
 616
 617 RMR_ERR_RETRY
 618
 619   request (send/call/rts) failed, but caller should retry
 620   (EAGAIN for wrappers)
 621
 622
 623 RMR_ERR_RCVFAILED
 624
 625   receive failed (hard error)
 626
 627
 628 RMR_ERR_TIMEOUT
 629
 630   response message not received in a reasonable amount of
 631   time
 632
 633
 634 RMR_ERR_UNSET
 635
 636   the message hasn't been populated with a transport buffer
 637
 638
 639 RMR_ERR_TRUNC
 640
 641   length in the received buffer is longer than the size of
 642   the allocated payload, received message likely truncated
 643   (length set by sender could be wrong, but we can't know
 644   that)
 645
 646
 647 RMR_ERR_INITFAILED
 648
 649   initialisation of something (probably message) failed
 650
 651
 652 RMR_ERR_NOTSUPP
 653
 654   the request is not supported, or RMR was not initialised
 655   for the request
 656
 657
 658 Depending on the underlying transport mechanism, and the
 659 nature of the call that RMR attempted, the system errno value
 660 might reflect additional detail about the failure.
 661 Applications should **not** rely on errno as some transport
 662 mechanisms do not set it with any consistency.
 663
 664 Configuration and Control
 665 ============================================================================================
 666
 667 With the assumption that most RMR based applications will be
 668 executed in a containerised environment, there are some
 669 underlying mechanics which the developer may need to know in
 670 order to properly provide a configuration specification to
 671 the container management system. The following paragraphs
 672 briefly discuss these.
 673
 674
 675 TCP Ports
 676 --------------------------------------------------------------------------------------------
 677
 678 RMR requires two (2) TCP listen ports: one for general
 679 application-to-application communications and one for
 680 route-table updates. The general communication port is
 681 specified by the application at the time RMR is initialised.
 682 The port used to listen for route table updates is likely to
 683 be a constant port shared by all applications provided they
 684 are running in separate containers. To that end, the port
 685 number defaults to 4561, but can be configured with an
 686 environment variable (see later paragraph in this section).
 687
 688 Host Names
 689 --------------------------------------------------------------------------------------------
 690
 691 RMR is typically host name agnostic. Route table entries may
 692 contain endpoints defined either by host name or IP address.
 693 In the container world the concept of a *service name* might
 694 exist, and likely is different than a host name. RMR's only
 695 requirement with respect to host names is that a name used on
 696 a route table entry must be resolvable via the gethostbyname
 697 system call.
 698
 699 Environment Variables
 700 --------------------------------------------------------------------------------------------
 701
 702 Several environment variables are recognised by RMR which, in
 703 general, are used to define interfaces and listen ports (e.g.
 704 the route table update listen port), or debugging
 705 information. Generally this information is system controlled
 706 and thus RMR expects this information to be defined in the
 707 environment rather than provided by the application. The
 708 following is a list of the environment variables which RMR
 709 recognises:
 710
 711
 712
RMR_BIND_IF

  The interface to which listen ports are bound. If not
  defined, 0.0.0.0 (all interfaces) is assumed.
 717
 718
 719 RMR_RTG_SVC
 720
 721   The port RMR will listen on for route manager connections.
 722   If not defined 4561 is used.
 723
 724
 725 RMR_SEED_RT
 726
 727   Where RMR expects to find the name of the seed (static)
 728   route table. If not defined no static table is read.
 729
 730
RMR_RTG_ISRAW

  If the value is set to 0, RMR expects the route table
  manager messages to be messages with an RMR header. If this
  is not defined, messages are assumed to be "raw" (without an
  RMR header).
 737
 738
 739 RMR_VCTL_FILE
 740
 741   Provides a file which is used to set the verbose level of
 742   the route table collection thread. The first line of the
 743   file is read and expected to contain an integer value to
 744   set the verbose level. The value may be changed at any
 745   time and the route table thread will adjust accordingly.
 746
 747
 748 RMR_SRC_NAMEONLY
 749
 750   If the value of this variable is greater than 0, RMR will
 751   not permit the IP address to be sent as the message
 752   source. Only the host name will be sent as the source in
 753   the message header.
 754
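An application, or a start-up wrapper, may wish to report the
values that RMR is likely to pick up from the environment. The
following sketch shows one way to read an integer variable and
fall back to a default when the variable is not defined. It is
an assumption about how such a lookup might be written, not
RMR's own code; the helper names int_or_default and env_int are
hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

// Hypothetical helper: convert text to an int, returning def when the
// text is missing, is not entirely a number, or is out of range.
int int_or_default( const char* text, int def ) {
    char* end = NULL;
    long  val;

    if( text == NULL || *text == '\0' ) {
        return def;
    }
    errno = 0;
    val = strtol( text, &end, 10 );
    if( errno != 0 || *end != '\0' || val < INT_MIN || val > INT_MAX ) {
        return def;
    }
    return (int) val;
}

// Hypothetical helper: read an integer environment variable, falling
// back to def when the variable is not defined or is not a number.
int env_int( const char* name, int def ) {
    return int_or_default( getenv( name ), def );
}
```

With these helpers, env_int( "RMR_RTG_SVC", 4561 ) yields the
route manager listen port described above, and
env_int( "RMR_SRC_NAMEONLY", 0 ) > 0 indicates that only the
host name is sent as the message source.
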


Logging
--------------------------------------------------------------------------------------------

RMR does **not** use any logging libraries; any error or
warning messages are written to standard error. RMR messages
are written with one of three prefix strings:



[CRI]

  The event is of a critical nature and it is unlikely that
  RMR will continue to operate correctly if at all. It is
  almost certain that immediate action will be needed to
  resolve the issue.


[ERR]

  The event is not expected and RMR is not able to handle
  it. There is a small chance that continued operation will
  be negatively impacted. Eventual action to diagnose and
  correct the issue will be necessary.


[WRN]

  The event was not expected by RMR, but can be worked
  round. Normal operation will continue, but it is
  recommended that the cause of the problem be investigated.

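Because the messages go to standard error, a supervising
process that captures that stream can use the prefix to decide
how urgently to react. The sketch below is illustrative only:
the severity_t values and classify_line are hypothetical names,
and the tag is searched for anywhere in the line on the
assumption that other fields might precede it.

```c
#include <string.h>

// Hypothetical severity codes for this sketch; they are not RMR constants.
typedef enum { SEV_NONE, SEV_WARNING, SEV_ERROR, SEV_CRITICAL } severity_t;

// Classify one line captured from standard error by the tag it carries.
// The most severe tag is tested first.
severity_t classify_line( const char* line ) {
    if( line == NULL ) {
        return SEV_NONE;
    }
    if( strstr( line, "[CRI]" ) != NULL ) {
        return SEV_CRITICAL;
    }
    if( strstr( line, "[ERR]" ) != NULL ) {
        return SEV_ERROR;
    }
    if( strstr( line, "[WRN]" ) != NULL ) {
        return SEV_WARNING;
    }
    return SEV_NONE;                    // not an RMR tagged line
}
```

A line without any of the three tags is reported as SEV_NONE, so
output from the application itself passes through unclassified.
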


_____________________________________________________________

[1] It is entirely possible to design a routing table, and
application group, such that the same message type is
left unchanged and the message is forwarded by an
application after updating the payload. This type of
behaviour is often referred to as service chaining, and can
be done without any "knowledge" by an application with
respect to where the message goes next. Service chaining is
supported by RMR in as much as it allows the message to be
resent, but the actual complexities of designing and
implementing service chaining lie with the route table
generator process.



[2] There is additional overhead to support multi-threaded
call as a special listener thread must be used in order to
deliver responses to the proper application thread.



Appendix A -- Quick Reference
============================================================================================

Please refer to the RMR manual pages on the Read the Docs
site:

https://docs.o-ran-sc.org/projects/o-ran-sc-ric-plt-lib-rmr/en/latest/index.html


Appendix B -- Message Buffer Details
============================================================================================

The RMR message buffer is a C structure which is exposed in
the rmr.h header file. It is used to manage a message
received from a peer endpoint, or a message that is being
sent to a peer. Fields include payload length, amount of
payload actually used, status, and a reference to the
payload. There are also fields which the application should
ignore; these could have been hidden in the header file, but
we chose not to hide them. These fields include a reference
to the RMR header information, and to the underlying
transport mechanism message struct which may or may not be
the same as the RMR header reference.

The Structure
--------------------------------------------------------------------------------------------

The following is the C structure. Readers are cautioned to
examine the rmr.h header file directly; the information here
may be out of date (old document in some cache), and thus it
may be incorrect.


::

 typedef struct {
     int    state;            // state of processing
     int    mtype;            // message type
     int    len;              // length of data in the payload (send or received)
     unsigned char* payload;  // transported data
     unsigned char* xaction;  // pointer to fixed length transaction id bytes
     int    sub_id;           // subscription id
     int    tp_state;         // transport state (errno)
                              // these things are off limits to the user application
     void*    tp_buf;         // underlying transport allocated pointer (e.g. nng message)
     void*    header;         // internal message header (whole buffer: header+payload)
     unsigned char* id;       // if we need an ID in the message separate from the xaction id
     int      flags;          // various MFL_ (private) flags as needed
     int      alloc_len;      // the length of the allocated space (hdr+payload)
     void*    ring;           // ring this buffer should be queued back to
     int      rts_fd;         // SI fd for return to sender
     int      cookie;         // cookie to detect user misuse of free'd msg
 } rmr_mbuf_t;



State vs Transport State
--------------------------------------------------------------------------------------------

The state field reflects the state at the time the message
buffer is returned to the calling application. For a send
operation, if the state is not RMR_OK then the message buffer
references the payload that could not be sent, and when the
state is RMR_OK the buffer references a *fresh* payload that
the application may fill in.

When the state is not RMR_OK, C programmes may examine the
global errno value which RMR will have left set, if it was
set, by the underlying transport mechanism. In some cases,
wrapper modules are not able to directly access the C-library
errno value, and to assist with possible transport error
details, the send and receive operations populate tp_state
with the value of errno.

Regardless of whether the application makes use of the
tp_state, or the errno value, it should be noted that the
underlying transport mechanism may not actually update the
errno value; in other words: it might not be accurate. In
addition, RMR populates the tp_state value in the message
buffer **only** when the state is not RMR_OK.

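The rule above, that tp_state is meaningful only when the state
is not RMR_OK, can be captured in a small helper. This is a
sketch under stated assumptions: sketch_mbuf_t is a cut-down
stand-in for rmr_mbuf_t, SKETCH_OK stands in for RMR_OK, and
transport_error is a hypothetical name; real code would include
rmr.h and use the real structure and constant.

```c
#include <stddef.h>

#define SKETCH_OK 0     // stand-in for RMR_OK; real code takes it from rmr.h

// Cut-down stand-in for rmr_mbuf_t with only the two fields of interest.
typedef struct {
    int state;          // state of processing
    int tp_state;       // transport state (errno)
} sketch_mbuf_t;

// Return the transport error to report, or 0 when there is none.
// tp_state is consulted only when the state is not OK, because it is
// populated only in that case.
int transport_error( const sketch_mbuf_t* mbuf ) {
    if( mbuf == NULL || mbuf->state == SKETCH_OK ) {
        return 0;
    }
    return mbuf->tp_state;
}
```

Even when a non-zero value is returned it should be treated as a
hint, since the transport may not have updated errno.
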

Field References
--------------------------------------------------------------------------------------------

The transaction field was exposed in the first version of
RMR, and in hindsight this shouldn't have been done. Rather
than break any existing code the reference was left, but
additional fields, such as trace data, were not directly
exposed to the application. The application developer is
strongly encouraged to use the functions which get and set
the transaction ID rather than using the pointer directly;
any data overruns will not be detected if the reference is
used directly.

In contrast, the payload reference should be used directly by
the application in the interest of speed and ease of
programming. The application must take the same care not to
write more bytes to the payload buffer than it can hold. By
the nature of the allocation of the payload in transport
space, RMR is unable to add guard bytes and/or test for data
overrun.

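Since RMR cannot detect an overrun of the payload, the check
falls to the application. The following sketch shows one
defensive way to copy a string into a payload. The helper
fill_payload is hypothetical; in an RMR based programme the
capacity would be obtained from the library (the Receive and
Send sample in Appendix D uses rmr_payload_size for this) and
the return value would be assigned to the len field.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: copy a string, including its terminating zero,
// into a payload that can hold capacity bytes. Returns the number of
// bytes written, or -1 if the string will not fit.
int fill_payload( unsigned char* payload, int capacity, const char* text ) {
    size_t need;

    if( payload == NULL || text == NULL || capacity <= 0 ) {
        return -1;
    }
    need = strlen( text ) + 1;              // include the terminating zero
    if( need > (size_t) capacity ) {
        return -1;                          // refuse rather than overrun
    }
    memcpy( payload, text, need );
    return (int) need;
}
```

Refusing the copy, rather than truncating, leaves the decision
about an oversized message with the caller.
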

Actual Transmission
--------------------------------------------------------------------------------------------

When RMR sends the application's message, the message buffer
is **not** transmitted. The transport buffer (tp_buf) which
contains the RMR header and application payload is the only
set of bytes which are transmitted. While it may seem to the
caller like the function is returning a new message buffer,
the same struct is reused and only a new transport buffer is
allocated. The intent is to keep the alloc/free cycles to a
minimum.


Appendix C -- Glossary
============================================================================================

Many terms in networking can be interpreted with multiple
meanings, and several terms used in this document are RMR
specific. The following definitions are the meanings of terms
used within this document and should help the reader to
understand the intended meaning.



Application

  A programme which uses RMR to send and/or receive messages
  to/from another RMR based application.


Critical error

  An error that RMR has encountered which will prevent
  further successful processing by RMR. Critical errors
  usually indicate that the application should abort.


Endpoint

  An RMR based application that is defined as being capable
  of receiving one or more types of messages (as defined by
  a *message key*).


Environment variable

  A key/value pair which is set externally to the
  application, but which is available to the application
  (and referenced libraries) through the getenv library
  function. Environment variables are the main method of
  communicating information such as port numbers to RMR.


Error

  An abnormal condition that RMR has encountered which will
  not affect the overall processing by RMR, but may impact
  certain aspects such as the ability to communicate with a
  specific endpoint. Errors generally indicate that
  something, usually external to RMR, must be addressed.


Host name

  The name of the host as returned by the gethostbyname
  library function. In a containerised environment this
  might be the container or service name depending on how
  the container is started. From RMR's point of view, a host
  name can be used to resolve an *endpoint* definition in a
  *route* table.


IP

  Internet protocol. A low level transmission protocol which
  governs the transmission of datagrams across network
  boundaries.


Listen socket

  A *TCP* socket used to await incoming connection requests.
  Listen sockets are defined by an interface and port number
  combination where the port number is unique for the
  interface.


Message

  A series of bytes transmitted from the application to
  another RMR based application. A message consists of
  RMR specific data (a header), and application data (a
  payload).


Message buffer

  A data structure used to describe a message which is to be
  sent or has been received. The message buffer includes the
  payload length, message type, message source, and other
  information.


Message type

  A signed integer (0-32000) which identifies the type of
  message being transmitted, and is one of the two
  components of a *routing key*. See *Subscription ID*.


Payload

  The portion of a message which holds the user data to be
  transmitted to the remote *endpoint*. The payload contents
  are completely application defined.


RMR context

  A set of information which defines the current state of
  the underlying transport connections that RMR is managing.
  The application will be given a context reference (pointer)
  that is supplied to most RMR functions as the first
  parameter.


Round robin

  The method of selecting an *endpoint* from a list such
  that all *endpoints* are selected before starting at the
  head of the list.


Route table

  A series of "rules" which define the possible *endpoints*
  for each *message key*.


Route table manager

  An application responsible for building a *route table*
  and then distributing it to all applicable RMR based
  applications.


Routing

  The process of selecting an *endpoint* which will be the
  recipient of a message.


Routing key

  A combination of *message type* and *subscription ID*
  which RMR uses to select the destination *endpoint* when
  sending a message.


Source

  The sender of a message.


Subscription ID

  A signed integer value (0-32000) which identifies the
  subscription characteristic of a message. It is used in
  conjunction with the *message type* to determine the
  *routing key*.


Target

  The *endpoint* selected to receive a message.


TCP

  Transmission Control Protocol. A connection based internet
  protocol which provides for lossless packet
  transportation, usually over IP.


Thread

  Also called a *process thread* or *pthread*. This is a
  lightweight process which executes concurrently with
  the application and shares the same address space. RMR
  uses threads to manage asynchronous functions such as
  route table updates.


Trace information

  An optional portion of the message buffer that the
  application may populate with data that allows for tracing
  the progress of the transaction or application activity
  across components. RMR makes no use of this data.


Transaction ID

  A fixed number of bytes in the *message buffer* which the
  application may populate with information related to the
  transaction. RMR makes use of the transaction ID for
  matching response messages when one of the call functions
  is used to send a message.


Transient failure

  An error state that is believed to be short lived and that
  the operation, if retried by the application, might be
  successful. C programmers will recognise this as EAGAIN.


Warning

  A warning occurs when RMR has encountered something that
  it believes isn't correct, but has a defined work round.


Wormhole

  A direct connection managed by RMR between the user
  application and a remote, RMR based, application.

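As an illustration of the *round robin* definition, the sketch
below selects endpoints from a list in turn. It is not RMR's
implementation; rr_group_t and rr_select are hypothetical names
and the endpoint strings used with them are invented for the
example.

```c
#include <stddef.h>

// Hypothetical group of endpoints which can all receive the same message.
typedef struct {
    const char** names;     // endpoint names, e.g. "host:port" strings
    size_t       count;     // number of entries in names
    size_t       next;      // index of the endpoint to use next
} rr_group_t;

// Select the next endpoint; every endpoint is used once before the
// selection wraps back to the head of the list.
const char* rr_select( rr_group_t* grp ) {
    const char* picked;

    if( grp == NULL || grp->count == 0 ) {
        return NULL;
    }
    picked = grp->names[grp->next];
    grp->next = (grp->next + 1) % grp->count;
    return picked;
}
```

Calling rr_select four times on a group of three endpoints
returns the first, second, and third endpoint and then the first
again.
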


Appendix D -- Code Examples
============================================================================================

The following code snippets illustrate some of the basic
operations of the RMR library. Please refer to the examples
and test directories in the RMR repository for complete RMR
based programmes.

Sender Sample
--------------------------------------------------------------------------------------------

The following code segment shows how a message buffer can be
allocated, populated, and sent. The snippet also illustrates
how the buffer returned by the send function is used to send
the next message. It does not illustrate error and/or retry
handling.


::

 void*       mrc;        // RMR context returned by rmr_init
 rmr_mbuf_t* sbuf;       // send buffer

 mrc = rmr_init( listen_port, MAX_BUF_SZ, RMRFL_NOFLAGS );
 rmr_set_stimeout( mrc, rmr_retries );
 while( ! rmr_ready( mrc ) ) {          // wait for a route table to arrive
     sleep( 1 );
 }
 sbuf = rmr_alloc_msg( mrc, 256 );      // 1st send buffer
 while( TRUE ) {
     sbuf->len = gen_status( (status_msg *) sbuf->payload );
     sbuf->mtype = STATUS_MSG;
     sbuf->sub_id = RMR_VOID_SUBID;     // subscription not used
     sbuf = rmr_send_msg( mrc, sbuf );  // returned buffer is used for the next send
     sleep( delay_sec );
 }
 rmr_close( mrc );

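Error and retry handling, which the sample above leaves out,
might be structured along the following lines. This is a sketch
only: the SK_ states are stand-ins for the states RMR reports in
the message buffer, send_fn_t is a hypothetical wrapper around
whatever send call the application uses, and flaky_send exists
purely to exercise the loop.

```c
#include <stddef.h>

#define SK_OK     0     // stand-in for RMR_OK
#define SK_RETRY  1     // stand-in for a transient (retry) state
#define SK_FAIL   2     // stand-in for any hard failure

// Hypothetical send operation: returns one of the SK_ states.
typedef int (*send_fn_t)( void* data );

// Call send up to max_attempts times, stopping early on success or on a
// hard failure. Returns the final state; *attempts (if not NULL) is set
// to the number of calls made.
int send_with_retry( send_fn_t send, void* data, int max_attempts, int* attempts ) {
    int state = SK_FAIL;
    int n = 0;

    while( send != NULL && n < max_attempts ) {
        state = send( data );
        n++;
        if( state != SK_RETRY ) {
            break;          // success or hard failure: a retry will not help
        }
    }
    if( attempts != NULL ) {
        *attempts = n;
    }
    return state;
}

// Stand-in send used to exercise the loop: fails transiently until the
// counter that data points to reaches zero.
int flaky_send( void* data ) {
    int* remaining = (int*) data;

    if( *remaining > 0 ) {
        (*remaining)--;
        return SK_RETRY;
    }
    return SK_OK;
}
```

In an RMR based sender the wrapper would call the send function
and derive its result from the state field of the returned
buffer; a hard failure is reported to the caller rather than
retried.
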


Receiver Sample
--------------------------------------------------------------------------------------------

The receiver code is even simpler than the sender code as it
does not need to wait for a route table to arrive (only
senders need to do that), nor does it need to allocate an
initial buffer. The example assumes that the sender is
transmitting a zero terminated string as the payload.


::

 rmr_mbuf_t* rbuf = NULL;
 void* mrc = rmr_init( listen_port, MAX_BUF_SZ, RMRFL_NOFLAGS );
 while( TRUE ) {
     rbuf = rmr_rcv_msg( mrc, rbuf );    // reuse buffer on all but first loop
     if( rbuf == NULL || rbuf->state != RMR_OK ) {
         break;
     }
     fprintf( stdout, "mtype=%d sid=%d pay=%s\n",
         rbuf->mtype, rbuf->sub_id, (char *) rbuf->payload );
     sleep( delay_sec );
 }
 fprintf( stderr, "receive error\n" );
 rmr_close( mrc );

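If the sender cannot be trusted to terminate the string, the
receiver can bound the output by the length carried in the
message buffer. The helper below is a hypothetical sketch of
that idea, written against plain arguments rather than
rmr_mbuf_t so that it stands alone; mtype, sub_id, payload and
len correspond to the fields of the same names.

```c
#include <stdio.h>

// Hypothetical helper: format a received payload as text without
// relying on a terminating zero. At most len bytes are read from the
// payload. Returns the value returned by snprintf.
int format_payload( char* out, size_t out_sz, int mtype, int sub_id,
        const unsigned char* payload, int len ) {
    if( payload == NULL || len < 0 ) {      // nothing usable to print
        payload = (const unsigned char*) "";
        len = 0;
    }
    return snprintf( out, out_sz, "mtype=%d sid=%d pay=%.*s",
        mtype, sub_id, len, (const char*) payload );
}
```

The precision given with %.*s limits how many payload bytes are
examined, so bytes beyond len are never read.
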


Receive and Send Sample
--------------------------------------------------------------------------------------------

The following code snippet receives messages and responds to
the sender if the message type is odd. The code illustrates
how the received message may be used to return a message to
the source. Variable type definitions are omitted for clarity
and should be obvious.

It should also be noted that things like the message type
which is returned to the sender (99) are arbitrary values
that these applications would have agreed on in advance and
are **not** RMR definitions.


::

 mrc = rmr_init( listen_port, MAX_BUF_SZ, RMRFL_NOFLAGS );
 rmr_set_stimeout( mrc, 1 );        // allow RMR to retry failed sends for ~1ms
 while( ! rmr_ready( mrc ) ) {        // we send, therefore we need a route table
     sleep( 1 );
 }
 mbuf = NULL;                        // ensure our buffer pointer is nil for 1st call
 while( TRUE ) {
     mbuf = rmr_rcv_msg( mrc, mbuf );        // wait for message
     if( mbuf == NULL || mbuf->state != RMR_OK ) {
         break;
     }
     if( mbuf->mtype % 2 ) {                // respond to odd message types
         plen = rmr_payload_size( mbuf );        // max size
                                                 // reset necessary fields in msg
         mbuf->mtype = 99;                       // response type
         mbuf->sub_id = RMR_VOID_SUBID;          // we turn subid off
         slen = snprintf( (char *) mbuf->payload, plen, "pong: %s", get_info() );
         mbuf->len = slen < plen ? slen + 1 : plen;  // bytes written, incl. terminator
         mbuf = rmr_rts_msg( mrc, mbuf );        // return to sender
         if( mbuf == NULL || mbuf->state != RMR_OK ) {
             fprintf( stderr, "return to sender failed\n" );
         }
     }
 }
 fprintf( stderr, "abort: receive failure\n" );
 rmr_close( mrc );


