<OpenSplice>
    <Domain>
        <Name>LOGROTATE</Name>
        <BuiltinTopics enabled="true" logfile="builtin.log"/>
    </Domain>
</OpenSplice>
Up to this release, in a shared memory deployment mode, each process that used OpenSplice would attempt to clean up its shared resources if it crashed. If for some reason the clean-up failed, some of the shared resources could leak, or in some edge cases the process could end up in a deadlock while cleaning up. In the V6.4.1 release, a new mechanism has been introduced that allows asynchronous clean-up of shared resources by the splice-daemon. The splice-daemon now detects when crashed processes leave shared resources behind and will try to clean them up if it is certain these resources have been left in a consistent state. In case (part of) the resources have not been left in a consistent state (this happens when a process crashes while modifying shared resources), it will terminate the entire middleware and return an error code.
Together with the introduction of this garbage collection mechanism, a new configuration element //OpenSplice/Domain/InProcessExceptionHandling has been introduced. This element controls whether processes will try to release their shared resources upon a crash. By default processes will try to release their resources; this is useful for application crashes that are unrelated to DDS operations, as it gives any ongoing DDS operations time to become consistent before the process leaves the system and avoids an unnecessary system shutdown. In case an accurate core file is required, this configuration element can be set to FALSE, meaning that processes will terminate immediately and leave all shared resources to be released by the splice-daemon.
For now, this feature only works on POSIX-compliant operating systems, but in the future this mechanism will be implemented for other operating systems as well.
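For illustration, a configuration sketch showing where the new element lives. The domain name is illustrative, and the value semantics should be verified against the deployment guide for your version; here FALSE is assumed to disable the default in-process cleanup:

```xml
<OpenSplice>
    <Domain>
        <Name>ospl_shmem_example</Name>
        <!-- Assumed: FALSE makes a crashing process terminate immediately
             (accurate core file), leaving shared-resource cleanup to the
             splice-daemon; the default enables in-process cleanup -->
        <InProcessExceptionHandling>FALSE</InProcessExceptionHandling>
    </Domain>
</OpenSplice>
```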
Please note: Due to a Java IDL bug fix (OSPL-4333) introduced at 6.4.0p5, any customer using the Java API and upgrading to 6.4.1 from version 6.4.0 to 6.4.0p4 inclusive will need to recompile their IDL using idlpp. Application code does not need to be recompiled.
Given the below IDL definition:
/* @file: Foo.idl */
struct Bar {
    long key;
};
#pragma keylist Bar key
At OSPL 6.1 the required CORBA-Java DCPS code-generation steps were:
1. $JACORB_HOME/bin/idl Foo.idl
This produced files:
BarHelper.java
BarHolder.java
Bar.java
2. $OSPL_HOME/bin/idlpp -C -l java Foo.idl
This produced files:
BarDataReaderHelper.java
BarDataReaderHolder.java
BarDataReaderImpl.java
BarDataReader.java
BarDataReaderOperations.java
BarDataReaderViewHelper.java
BarDataReaderViewHolder.java
BarDataReaderViewImpl.java
BarDataReaderView.java
BarDataReaderViewOperations.java
BarDataWriterHelper.java
BarDataWriterHolder.java
BarDataWriterImpl.java
BarDataWriter.java
BarDataWriterOperations.java
BarMetaHolder.java
BarSeqHolder.java
BarTypeSupportHelper.java
BarTypeSupportHolder.java
BarTypeSupport.java
BarTypeSupportOperations.java
Now at OSPL 6.2 the required steps are:
1. $JACORB_HOME/bin/idl Foo.idl
This step is the same as 6.1 and produces the same files.
2. $OSPL_HOME/bin/idlpp -C -l java Foo.idl
This step uses the same command as at 6.1; however, the files produced are now:
BarDataReaderImpl.java
BarDataReaderViewImpl.java
BarDataWriterImpl.java
BarMetaHolder.java
BarTypeSupport.java
FooDcps.idl
3. $JACORB_HOME/bin/idl -I$OSPL_HOME/etc/idl FooDcps.idl
This step is new at 6.2 and processes the *Dcps.idl file now produced by step 2 with the ORB IDL compiler. This produces the below files, per the IDL to Java mapping:
BarDataReaderHelper.java *
BarDataReaderHolder.java *
BarDataReader.java *
_BarDataReaderLocalBase.java
BarDataReaderLocalTie.java
BarDataReaderOperations.java *
BarDataReaderViewHelper.java *
BarDataReaderViewHolder.java *
BarDataReaderView.java *
_BarDataReaderViewLocalBase.java
BarDataReaderViewLocalTie.java
BarDataReaderViewOperations.java *
BarDataWriterHelper.java *
BarDataWriterHolder.java *
BarDataWriter.java *
_BarDataWriterLocalBase.java
BarDataWriterLocalTie.java
BarDataWriterOperations.java *
BarSeqHelper.java
BarSeqHolder.java *
BarTypeSupportInterfaceHelper.java
BarTypeSupportInterfaceHolder.java
BarTypeSupportInterface.java
_BarTypeSupportInterfaceLocalBase.java
BarTypeSupportInterfaceLocalTie.java
BarTypeSupportInterfaceOperations.java
Files marked with an * were previously produced by step 2 (i.e. idlpp); note the file BarTypeSupportOperations.java is no longer produced by any step.
Report ID. | Description |
---|---|
OSPL-14865 / 00021858 |
In the isocpp API the generic InstanceHandle constructor should be explicit. The issue is that the implicit constructor of the InstanceHandle template allows any type to be implicitly converted to an InstanceHandle. Specifying the constructor as explicit solves this issue. See the corresponding isocpp specification issue. Solution: The constructor specified in dds::core::TInstanceHandle is now defined as explicit. |
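The effect of this fix can be sketched in isolation with hypothetical handle types (not the real dds::core::TInstanceHandle):

```cpp
#include <type_traits>

// A non-explicit templated constructor accepts *any* type implicitly,
// which is the problem the report describes.
struct ImplicitHandle {
    template <typename T>
    ImplicitHandle(const T&) {}          // int, double, ... all convert silently
};

// The fix: marking the constructor explicit forbids implicit conversion
// while still allowing direct construction.
struct ExplicitHandle {
    template <typename T>
    explicit ExplicitHandle(const T&) {}
};

// Any type silently converts to ImplicitHandle...
static_assert(std::is_convertible<int, ImplicitHandle>::value,
              "implicit constructor accepts anything");
// ...but not to ExplicitHandle, although direct construction still works.
static_assert(!std::is_convertible<int, ExplicitHandle>::value,
              "explicit constructor blocks implicit conversion");
static_assert(std::is_constructible<ExplicitHandle, int>::value,
              "direct construction is still allowed");
```

With the explicit constructor, a call such as `take_handle(42)` no longer compiles by accident; the caller must write `take_handle(ExplicitHandle(42))` deliberately.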
OSPL-14847 / 00021846 |
When DDS security is used, the Tuner may not be able to create a reader for topics allowed by security. For the Tuner to create a reader or writer for a certain topic, it needs the associated topic and type definition. Topic discovery would provide that information. However, when DDS security is used the topic discovery information may not be distributed. It appears that topic information is only sent when the permissions file specifies the all wildcard "*" for the partition part of an allow rule. At the moment the topic is created, the ddsi service will try to send topic discovery information and will ask the access control plugin if that is allowed. However, the access control plugin will reject that request because the partition related to a reader or writer is not yet known. Solution: When DDS security is enabled, topic discovery information is sent when access control permits the creation of a reader or writer. In this case the associated topic information will be distributed. |
OSPL-14825 / 00021833 |
[NodeJS] IDL containing a union field of type sequence did not work. If an IDL type contained a Union object with a field of type Sequence, then the NodeJS API would either fail with an obscure error or write data that did not contain the sequence data. Solution: NodeJS now supports sequence fields within a union. |
OSPL-14787 / 00021808 |
Transient-local alignment may be slow in the case of large fragmented user samples. When retransmitting a fragmented message, ddsi will first send only the first fragment of the sample to provide better flow control of large samples. When message loss occurs, this prevents complete samples from having to be retransmitted when fragments are lost. When using this mode, ddsi handles one sample at a time and proceeds with the next sample only after the first sample has been completely acknowledged. This could cause the alignment of a large amount of transient-local data to become slow, since each sample costs a roundtrip. To accelerate the alignment, fragments of several samples can be retransmitted in parallel. Solution: When a number of fragmented samples are scheduled for retransmission, fragments of several of these samples are retransmitted before waiting for a nackfrag message. |
OSPL-14772 / 00021794 |
The spliced daemon may deadlock on termination when a service did not terminate in time and the spliced daemon forcefully terminates the service. When a process terminates or crashes, the spliced daemon will try to clean up the shared memory resources that the process may have left behind. When a process crashes while accessing shared memory, which should not occur, the spliced daemon will not try to clean up the shared memory resources and will terminate, because the state of the shared memory could be compromised. However, in this case the spliced daemon receives a termination signal and starts terminating and shutting down the services. The ddsi and durability services did not terminate fast enough, which causes the spliced daemon to send a kill -9 to these services. Although the spliced daemon is in the terminating phase, it still detects that the durability process has terminated and that the durability process was accessing shared memory when it received the kill -9 signal. Because the spliced daemon is already in the terminating state, it does not check if the shared memory is compromised and starts cleaning up the shared memory, which causes the deadlock because the durability service was still holding a lock. Solution: When the spliced daemon has to forcefully terminate a non-responding service during termination, it terminates directly without performing a cleanup action. |
OSPL-14770 / 00021791 |
The durability service may deadlock when resolving a connect conflict with nodes that have a role defined. When detecting a fellow, the durability service creates a connect conflict. A connect conflict can be combined with an existing connect conflict from a different fellow, which enables a connect conflict to be resolved for all fellows in one alignment action. However, connect conflicts from fellows which have different roles cannot be combined. The role of a fellow becomes known when a namespace message is received. Initially, connect conflicts of fellows are combined, but when the role information becomes available the connect conflicts have to be split again. The split of the connect conflicts caused a deadlock because the lock of the conflict was not released when adding the split conflict to the conflict administration. Solution: When a connect conflict for a particular fellow is removed from a combined connect conflict because the role of this fellow does not match, the conflict administration lock is released before adding the new conflict. |
OSPL-14763 |
DLite is slow to process incoming alignment data. The main loop of DLite periodically waits in a waitset for incoming protocol messages. If alignment beads are received, the waitset will unblock and DLite will process the incoming alignment data. However, every 100 messages DLite will pause processing of alignment data and return to the main loop to verify if it needs to do some housekeeping. After that it should continue processing the remaining beads. However, it continues by calling the waitset wait again, which is incorrect in the case where no new data is received while unprocessed data still remains. If no new data is received, the data-available status is not set because it was reset in the previous cycle, so the wait will block until new data arrives or a timeout occurs (1 second), after which the next 100 beads are processed. This means that in a worst-case scenario the processing of received alignment data will add a 1 second delay (timeout) for every 100 beads. Solution: Check if unprocessed data exists before entering the waitset wait; the wait is skipped in the case where unprocessed data exists. |
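The corrected loop structure can be sketched as follows. All names are illustrative (this is not the actual DLite source); the point is that the blocking wait is skipped whenever a backlog of unprocessed beads remains:

```cpp
#include <deque>

// Minimal model of the corrected main loop: beads are drained in batches
// of `batch`, housekeeping runs between batches, and the waitset wait is
// only entered when no unprocessed data remains.
struct AlignmentLoop {
    std::deque<int> pending;   // received but not yet processed beads
    int processed = 0;
    int waits = 0;

    void wait_for_data() { ++waits; }    // stand-in for the waitset wait

    // Process at most `batch` beads, then return for housekeeping.
    void process_batch(int batch) {
        for (int i = 0; i < batch && !pending.empty(); ++i) {
            pending.pop_front();
            ++processed;
        }
    }

    void run_cycle(int batch) {
        if (pending.empty())   // the fix: only block when there is no backlog
            wait_for_data();
        process_batch(batch);
        // ... housekeeping would happen here ...
    }
};
```

With 250 beads already queued and a batch size of 100, three cycles drain the queue without ever blocking in the wait; the old behavior would have blocked (up to 1 second each) before the second and third batches.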
OSPL-14759 / 00021786 |
Using ordered access with group scope could result in a segmentation fault. To provide ordered access with group scope, a subscriber-related resource is shared by the readers of that subscriber. Not all operations on this shared resource were properly locked, which could allow multiple readers to manipulate the shared resource at the same time, causing it to become corrupt. Solution: All concurrent operations on the shared resource which is used to provide ordered access on group scope are now properly locked. |
OSPL-14749 / 00021776 |
A sample from a reliable writer can get lost when the sample is written during startup of OpenSplice. When an application writer writes a sample, the sample is put in a queue. The samples on this queue are then handled by the ddsi service and forwarded onto the network. To handle a sample, the ddsi service needs information about the application writer. For that purpose the ddsi service listens to the internally generated builtin topics that are created when the application writer is created. When the ddsi service reads a sample from the internal queue, it checks if it already knows about the application writer, meaning that it has received the internally generated builtin topic associated with the application writer. The ddsi service will drop the sample when it has not yet received the corresponding builtin topic. Normally this cannot happen, because the builtin topic is created when the application writer is created and thus before a sample can be written. However, during startup of OpenSplice it could occur that the ddsi service is not yet ready to receive the internally generated builtin topics; the resend manager is then responsible for providing the builtin topics to the ddsi service at a later time. In that case there is a small chance that the ddsi service retrieves an application sample from the queue before it has received the corresponding builtin topic and will then drop the sample. Solution: The spliced daemon will set the state to operational only after the networking services have been initialized and are able to process the builtin topics. This resolves the issue because the creation of a domain participant waits until the state becomes operational. |
OSPL-14684 / 00021591 |
The functions ignore_participant, ignore_publication and ignore_subscription do not work correctly. These functions do not work as intended. The idea is that you pass the instance handle of the entity you want to ignore (you can obtain this instance handle from the builtin topics, or from the get_instance_handle function on the Entity class), after which all data originating from that Entity will be discarded. However, something went wrong in the translation of the instance handle into the intended target, causing the target not to be located and therefore not to be ignored. Solution: The translation from instance handle to intended target has been corrected, so the intended target is now ignored correctly. |
OSPL-14255 / 00021085 |
In certain circumstances BY_RECEPTION_TIMESTAMP topics (including the builtin topics) may go back in time. BY_RECEPTION_TIMESTAMP topics (this includes all builtin topics, which are BY_RECEPTION_TIMESTAMP according to the DDS specification) would always append newly arriving samples to the end of their corresponding instance queue. This would allow them to go back in time if samples were ever received out-of-order. One particular scenario where this could wreak havoc is when builtin topics that are aligned using either Durability or DLite (this is for example the case when using native networking, where the builtin topics do not get aligned as part of the networking protocol) get disposed before the transient snapshot (in which they are still included) arrives. In a case like that, you first get a DISPOSE message for a DCPSPublication through the live networking path, followed by the sample preceding it from the transient snapshot, which would result in the DCPSPublication representing a particular Writer ending up in the ALIVE state instead of the DISPOSED state. This could cause Readers to assume there is still at least one ALIVE Writer while in fact there is not, causing their data to stay ALIVE even when this is incorrect. Also, mechanisms like synchronous reliability do not work correctly in scenarios like that: if a DCPSSubscription represents a synchronous Reader, then the writing side can be fooled into believing there is still a Reader that needs to acknowledge its data, when in fact this Reader has already left the system. The writing side will then block waiting for an acknowledgement that will never be sent, effectively blocking the Writer application indefinitely. Solution: BY_RECEPTION_TIMESTAMP topics are no longer allowed to go back in time for data originating from the same DataWriter. Each individual instance with samples originating from the same source effectively becomes "eventually consistent". |
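The per-writer "no going back in time" rule can be sketched with a hypothetical instance administration (illustrative names only, not the OpenSplice kernel types): a sample is rejected when a newer sample from the same DataWriter has already been inserted, while samples from other writers are unaffected.

```cpp
#include <map>

// Minimal model of one instance queue: track the newest source timestamp
// seen per writer and reject anything that would move that writer backwards.
struct InstanceQueue {
    std::map<int, long> last_seen;   // writer id -> newest source timestamp

    // Returns true when the sample is accepted into the instance queue.
    bool insert(int writer, long source_ts) {
        std::map<int, long>::iterator it = last_seen.find(writer);
        if (it != last_seen.end() && source_ts <= it->second)
            return false;            // out-of-order: would go back in time
        last_seen[writer] = source_ts;
        return true;
    }
};
```

In the disposed-publication scenario above: the DISPOSE arriving via the live path (say at t=10) is inserted first; the older ALIVE sample from the transient snapshot (t=5, same writer) is then rejected, so the instance correctly ends in the disposed state.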
OSPL-14233 |
For the RT networking service, allow the use of the loopback interface. Specifying the loopback interface in the configuration of the networking service did not work. When selecting the network interface to be used, the networking service checks if the interface is multicast capable, which is normally not the case for the loopback interface. Therefore the networking service would ignore the configured loopback interface and select the first multicast-capable interface. Solution: When the networking configuration specifies the loopback interface, it is accepted without checking whether it is multicast capable. When using the loopback interface to communicate between networking instances on the same node, the GlobalPartition address must be a multicast address and EnableMulticastLoopback must be enabled (which is the default). |
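For illustration, a configuration sketch for the loopback case described above. Element names and placement follow the OpenSplice deployment guide conventions but should be verified against your version; the multicast address is an arbitrary example:

```xml
<NetworkService name="networking">
    <General>
        <!-- Loopback interface is now accepted even though it is not
             multicast capable -->
        <NetworkInterfaceAddress>127.0.0.1</NetworkInterfaceAddress>
        <!-- Default is already true; shown for clarity -->
        <EnableMulticastLoopback>true</EnableMulticastLoopback>
    </General>
    <Partitioning>
        <!-- GlobalPartition address must be a multicast address when
             communicating over loopback on the same node -->
        <GlobalPartition Address="239.255.0.1"/>
    </Partitioning>
</NetworkService>
```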
Report ID. | Description |
---|---|
OSPL-14557 / 00021626 |
A memory leak may occur when using a dataview. When performing a read or take on a dataview, the read operation walks through the instance table associated with the view and temporarily increments the reference count of each instance. After processing an instance the reference count was not decremented, which causes the memory leak. Solution: Release a dataview instance after it has been accessed by a read or take operation, because the read or take increases the reference count of the instance. |
OSPL-14621 / 00021673 |
A reader instance of a reader which uses a content filter may leak. When a content filter is used, samples that do not match the filter are not inserted into the reader. However, when a sample is received, the reader instance that corresponds with this sample is looked up in the reader administration before the filter expression is evaluated. When that instance is not found, a new reader instance is created and inserted into the reader administration. When the sample subsequently does not pass the content filter, the newly created empty instance will leak. Solution: When a content filter is used, check whether the received sample passes the content filter before creating a reader instance. |
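The order-of-operations fix can be sketched as follows (hypothetical types, not the actual reader administration): the filter is evaluated before the instance lookup-or-create step, so rejected samples can no longer leave an empty instance behind.

```cpp
#include <map>
#include <string>

// Minimal model of a reader administration keyed by instance key.
struct Reader {
    std::map<std::string, int> instances;   // instance key -> sample count

    // The fix: evaluate the content filter *first*; only accepted samples
    // reach the lookup-or-create of the instance.
    template <typename Filter>
    void receive(const std::string& key, int value, Filter passes) {
        if (!passes(value))
            return;                // rejected before any instance is created
        instances[key] += 1;       // lookup-or-create only for accepted data
    }
};
```

With the old ordering, a filtered-out sample for a previously unseen key would still have created (and leaked) an empty entry in `instances`.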
OSPL-14683 / 00021723 |
A take/read_next_instance on a dataview may incorrectly fail. The take/read_next_instance on a view loops through the view instances until it finds an instance and sample that pass the provided instance and sample masks. However, the state of the sample was already changed before checking whether it matches the provided instance mask. For example, it was set to read before the check on the instance mask was performed, even when that check indicated that the sample does not match. Solution: When performing a take/read_next_instance operation on a view, first check whether the instance passes the provided mask. |
OSPL-14694 / 00021729 |
Rank values and GenerationCount values of the SampleInfo object in isocpp2 are always set to 0. The attributes in the rank() object and generation_count() object in the SampleInfo of the ISOCPP2 API were always set to 0, even in cases where they should have been > 0. This was caused by the Reader modifying a copy of the object instead of its original value. Solution: Instead of obtaining a copy of the rank() or generation_count() object and modifying its attributes, we now instantiate a new rank() or generation_count() object and set that directly into the SampleInfo. |
Report ID. | Description |
---|---|
OSPL-14714 / 00021756 |
Ddsi discovery of remote entities may fail after an asymmetrical disconnect. The ddsi discovery protocol for readers and writers uses transient-local semantics. When an asymmetrical disconnect occurs, caused by massive packet loss, a transient-local reader may not receive all the data of the corresponding transient-local writer, because the writer did not notice the disconnect, assumes that all readers have received all the data, and does not send a heartbeat. The asymmetrically disconnected reader does not send an acknack to retrigger the transient-local realignment either, because a heartbeat from the writer was already received before the asymmetrical disconnect occurred. Solution: A reader keeps asking for data (sending an acknack) at a configurable interval (1s by default) when it detects that it has not received all the data. |
OSPL-14407 / 00021529 |
A master conflict may cause two successive alignments to occur. When a master conflict occurs and another node becomes master, the resolution of the master conflict at a node that is an alignee for the corresponding namespace will result in alignment requests being sent to the new master. The new master will issue alignment requests to other present aligners of the namespace, because it may not have the complete state when it became the new master. After the new master has received all the relevant data from the other aligners, it will raise the state of the namespace. This causes a native state conflict at all the alignees, which will then again request alignment data from the new master. If the master answered the first alignment request only after it had received the alignment data from the other aligners, it would not be necessary for the alignees to perform an additional alignment request when detecting the native state conflict. Solution: When a master conflict results in a new master, the new master will delay answering alignment requests until it has received the alignment data from the other aligners. When resolving a native state conflict, a node will check if it has already received alignment data from the new master that was collected after the new master had raised the state of the corresponding namespace; when that is the case, it considers the native state conflict solved. |
OSPL-14606 / 00021668 |
Durability incorrectly discards a native state conflict when receiving a namespace message out of order. A durability protocol message could be received out-of-order. In this case, an old namespace message gets processed after a namespace message from the master node indicating a state change that would normally generate a native state conflict. However, the processing of the old namespace message causes the namespace state to be reset, which causes the native state conflict to be discarded. Solution: The durability service discards messages which are older than the last handled message. |
OSPL-14610 / 00021670 |
A disposed instance may be revived as a result of a durability alignment. When an instance is disposed and unregistered, the take operation may remove the instance from the reader cache. A retention period can be configured to keep the instance present for some time after the take operation. The reader instance maintains some information about the last sample that was taken, to prevent old samples from being received again. However, when a dispose message is received followed by an unregister message, the unregister message will purge the dispose message from the reader cache without recording the write time of the dispose. In this case, a durability alignment was being performed at approximately the same time, and the alignment data did not contain the dispose and unregister messages because the alignment snapshot was taken just before these messages were written. When the durability service then injects the data using a replace policy, which purges the reader instance using the time of the snapshot, it may inject an old sample into the reader instance again, making the instance alive again. Solution: When a dispose message is pushed out of a reader instance because of the reception of an unregister message, the write time of the dispose message is now recorded in the instance to prevent older samples from updating the reader instance. |
OSPL-14633 / 00021683 |
OpenSplice Tuner does not correctly show query details. In a long-standing regression (nearly a decade), OpenSplice Tuner has not displayed the 'expression', expression 'parameters', or 'instance', 'sample' or 'view' states associated with the query object. This was last known to work in release 6.4.3p17. Solution: The display of these details has been corrected, with the exception of expression parameter values, which cannot be easily retrieved from the internal structures of OpenSplice. |
OSPL-14637 / 00021655 |
The durability service may remain in the incomplete state after handling the last conflict. When the conflict queue becomes empty, the durability service marks the transient store as complete. A race condition could cause a conflict that is processed very fast to incorrectly leave the complete state unset. Solution: The setting and resetting of the complete state is protected by a mutex lock. |
OSPL-14641 / 00021692 |
A problem in the networking service may prevent messages from being delivered to readers. When detecting a fellow node, the networking service sends a sync message communicating the expected sequence numbers. The receive side of the networking service does not forward received messages until this sync message is received. The sending side schedules a timer to resend the sync message when no acknowledgement is received. However, this timer could also be cancelled when receiving an acknowledgement for a normal data message (expected sequence number). This could cause the receiving node to never receive the sync message. Solution: The sync message resend timer is only cancelled when receiving a sync acknowledgement message. |
OSPL-14643 / 00021689 |
Wait_for_historical_data on a transient-local reader fails when there exists a transient-local writer which has not sent data. When an application creates a second transient-local reader some time after the first reader was created, and there exists a transient-local writer in the system, the wait_for_historical_data call on the second reader may fail because it determines that the last sequence number received from the writer is still 0. However, when the administration shows that the writer had previously responded with a heartbeat indicating that no data was available, the second reader should set its state in sync with the writer. Solution: When a transient-local reader is created which matches a transient-local writer that has not sent any data yet, and a heartbeat of that writer was received before, the reader is set in sync with this writer. |
OSPL-14666 / 00021711 |
Java exception in OpenSplice Tuner when viewing Query 'Data type' details. In OpenSplice Tuner, if you select a Query object, view its details, and then switch to the 'Data type' tab, a Java exception will occur. Solution: The exception has been corrected, and the 'Data type' tab now correctly displays the data type of the Topic associated with the Query. |
OSPL-14667 / 00021710 |
At startup of a node, the durability service may perform an unnecessary alignment with the master. The durability services communicate the state of their namespaces with each other. In this information, the master node indicates the current state of the namespace. When a client node detects that the state of the namespace has changed (increased), it will raise a native state conflict, which will result in an alignment action with the master. Initially, the client sets its own namespace state to 0. At the moment the durability service becomes complete, it sets the state of the namespace to the state of the master. At startup of the client, the namespace state of the master is probably already higher than one. This may cause the client to incorrectly determine that a native state conflict has occurred and start an extra alignment with the master. Note that this alignment is unnecessary because the client node has already performed, or will perform, the alignment action that is carried out when the durability service starts. Solution: When the durability service receives the first namespace information message from a fellow, it maintains information about the fellow's namespace state. When checking for a native state conflict, it compares the initial namespace state received from the master with the current state and only raises a native state conflict when the current master namespace state has increased. |
OSPL-14671 |
Insertion of historical data which has the synchronize flag set results in unnecessary acknowledgements. When receiving a message which has the synchronize flag set, indicating that the message was written by a synchronous writer, an acknowledgement is sent. This also happens when the durability service injects historical data, which may cause a storm of these acknowledgements to be sent and a network overload to occur when several nodes are aligned with this kind of data at the same time. Solution: Acknowledgements are not sent when injecting historical messages which have the synchronize flag set. |
OSPL-14693 / 00021738 |
Network initial synchronization with a newly detected node may fail. When the networking service detects a fellow node, it sends a sync message which indicates the next packet sequence number that will be sent. However, older packets may already have been received by the fellow node, which will then acknowledge these packets. When receiving an acknowledgement, the sending node should populate the resend list associated with the fellow with the packets that have not yet been acknowledged. This did not occur when the expected packet sequence number indicated in the sync message had not yet been sent. Solution: When a node receives the first acknowledgement from a fellow node and the acknowledged sequence number is acceptable (not too old), the resend list is populated with the packets that were sent later but not yet acknowledged. |
OSPL-14713 / 00021525 |
Simulink integration functions idlImportSl and idlImportSLWithIncludePath fail if an output directory is specified. Both idlImportSl and idlImportSLWithIncludePath accept an optional final argument 'outputDirectory'. When a caller provides this parameter, the resulting call to the IDLPP processor will fail, resulting in the function failing. Solution: The order of arguments passed to IDLPP has been changed to prevent the failure. |
OSPL-14725 |
Multi-domain support on Windows broken when using the ospl tool. When using ospl start to start multiple domains on Windows, the ospl tool killed the already running domain. Solution: The ospl tool has been fixed so it can now start multiple domains again. |
OSPL-14726 / 00021760 |
Instances may not be disposed after a durability master switch. When a conflict is created, the "kernel groups" (transient store) are set to the INCOMPLETE state. When all conflicts are resolved and the conflict queue becomes empty, the conflict resolver sets the "kernel groups" back to the COMPLETE state. When the transient store is in the INCOMPLETE state, purging of the transient store is disabled. When an instance is disposed and unregistered, the instance will be purged from the transient store after the service_cleanup_delay has expired. In this case, the service_cleanup_delay is set to 0; thus the instance may be purged immediately when it gets disposed and unregistered while the transient store is in the COMPLETE state. In this case, the durability service detects a master conflict, which should set the transient store to the INCOMPLETE state. However, just before that event, the conflict resolver has determined that the conflict queue became empty and will then set the transient store to the COMPLETE state. Due to a race condition between the thread that raised the master conflict and the conflict resolver which sets the transient store to the COMPLETE state, it may occur that the transient store remains in the COMPLETE state while the durability service is performing an alignment as a result of the master conflict. When, at the master node, an instance is disposed and unregistered just after the alignment request from the client node, and the dispose and unregister are received at the client node before the durability service at the client node injects the alignment data (which does not contain the dispose and unregister), the instance may be purged from the transient store before the alignment data is injected, which causes the instance to become alive again. An existing reader will not notice this, but a late-joining reader will incorrectly see the instance alive.
Solution: The conflict administration is locked when the durability service sets the transient store in the complete state. |
Report ID. | Description |
---|---|
OSPL-14625 / 00021610 |
Alignment issue causing corrupted builtin-topic data when accessed by the CSharp language binding. Mis-alignment with the native C structure of QoS policies caused corruption of, for example, the PublicationBuiltinTopicData, leading to invalid values in other members of the struct. Solution: An internal array was marshalled based on its size in bytes instead of its number of elements. This was resolved by fixing the relevant code in idlpp responsible for CSharp code binding generation. |
OSPL-14578 / 00021638 |
Possible crash when deleting multiple DataReaders with ordered-access QoS. A DataReader with ordered-access QoS shares some administration with its subscriber. When multiple DataReaders exist for the same subscriber, deleting one DataReader would leave behind some administration causing a crash when subsequent DataReaders are deleted. Solution: The code to remove a DataReader was updated to handle this case correctly. |
OSPL-14628 / 00021666 |
Samples with topic-scope coherent presentation QoS not correctly processed by DDSI service. An issue with the identification of (remote) writers with a coherent topic-scope QoS can cause samples to not get delivered to a matching reader. This occurs only for remote writers when the DDSI2 network-service is used. Solution: The issue was fixed in DDSI2 by setting the correct identification on the incoming data, so it is correctly processed when delivered to local readers by the OpenSplice kernel. |
OSPL-14693 / 00021738 |
Network initial synchronization with a newly detected node may fail. When the networking service detects a fellow node, it will send a sync message which indicates the next packet sequence number that will be sent. However, older packets may already have been received by the fellow node, which will then acknowledge these packets. When receiving an acknowledgement, the sending node should populate the resend list associated with the fellow with the packets that are not yet acknowledged. This does not occur when the expected packet sequence number indicated in the sync message has not yet been sent. Solution: When a node receives the first acknowledgement from a fellow node and the acknowledged sequence number is acceptable (not too old), the resend list is populated with the packets that were sent later but not yet acknowledged. |
OSPL-14700 / 00021749 |
Listener thread priority policy mix-up in the ISOCPP2 API. When using ISOCPP2 and setting a thread priority policy to SCHEDULE_TIMESHARING, the actual result was that it was set to SCHEDULE_REALTIME, and when setting it to SCHEDULE_REALTIME it was set to SCHEDULE_TIMESHARING. Solution: The thread priority policy mechanism has been adjusted to apply the actually chosen policy. |
Report ID. | Description |
---|---|
OSPL-13974 / 00020849 |
A node configured with master priority 0 may cause an unwanted master conflict to be raised. A node which is configured as aligner and with master priority 0 will become master of the corresponding namespace when at that moment no master with a higher priority is available. When a node with a higher master priority is selected as master, it will notify the other nodes after the master selection. However, a namespace message sent by the node with master priority 0 before it has received the information about the new master may raise an unnecessary master conflict, which may cause an unnecessary alignment action to occur. Solution: Namespace messages from a node that has master priority 0 no longer cause a master conflict to be raised at other nodes. |
OSPL-13958 / 00020837 |
On termination of the durability service, a fatal error may occur when cleaning up resources. The durability service contains listener threads which listen to events and action threads which perform some action. The action threads are associated with a particular listener but may access resources of other listener threads. When the durability service terminates, first all the listener threads are stopped and then the resources related to these listener threads are freed. However, the action threads are stopped later and could still try to access the resources of a listener. In this case an action thread tries to take a mutex that has already been freed, causing an abort. Solution: The action threads are now stopped when the associated listener thread is stopped. This prevents the action threads from still running when the resources are being freed. |
OSPL-13963 / 00020842 |
A delayed durability message may cause a non-responsive fellow to be incorrectly considered alive. When the networking service detects that a node has become non-responsive it will report this, and the durability service will mark the fellow durability service as non-responsive. However, messages from that fellow durability service may still be queued either in the networking service or in the reader caches of the durability data readers. When such a message arrives after the fellow has been declared non-responsive, the fellow is considered alive again, but obviously does not respond, which may cause the durability service to never become complete. Solution: When a durability fellow is detected to be non-responsive it is placed on a blacklist for some time. When the durability service receives a message from that fellow durability service, it checks whether the message was issued before the fellow was marked as non-responsive; if so, the message is ignored. |
OSPL-14012 / 00020823 |
Tester: Old browser nodes not removed even when "Show disposed participants" is unchecked. Items representing disposed entities in the tree view of the Tester "Browser" tab are not removed even when the "Show disposed participants" option is cleared. Instead, these tree items are displayed with an orange background. Solution: Tester has been updated to remove these nodes from the tree view, when the "Show disposed participants" option is cleared. |
OSPL-14172 / 00021015 |
All nodes switch continuously from COMPLETE to INCOMPLETE with DMPA enabled. When receiving an update of a namespace from a fellow, the master priority field is not copied to the existing namespace information from that fellow. Solution: The master priority contained in a received namespaces message is now updated in the corresponding namespace information which a durability service maintains for each fellow. |
OSPL-14221 / 00021067 |
The Replace merge may incorrectly remove newer samples from the reader. The replace merge policy uses the timestamp of the request to determine which samples have to be replaced and which instances have to be disposed. However, this may produce an incorrect result when newer samples are written after the time of the request but before the aligner has made the snapshot of the data to be aligned. This may also occur when the clocks of the aligner and the alignee are not sufficiently synchronized. To solve this problem the time of the snapshot should be used. Solution: The snapshot time is added to the final alignment message. At the alignee, this snapshot time is used when applying the Replace policy. |
OSPL-14255 / 00021085 |
In certain circumstances BY_RECEPTION_TIMESTAMP topics (including the builtin topics) may go back in time. BY_RECEPTION_TIMESTAMP topics (this includes all builtin topics, which are BY_RECEPTION_TIMESTAMP according to the DDS specification) would always append newly arriving samples to the end of their corresponding instance queue. This would allow them to go back in time if samples were ever received out of order. One particular scenario where this could wreak havoc is when builtin topics that are aligned using durability/dlite (this is for example the case when using native networking, where the builtin topics do not get aligned as part of the networking protocol) get disposed before the transient snapshot (in which they are still included) arrives. In such a case, you first get a DISPOSE message for a DCPSPublication through the live networking path, followed by the sample preceding it from the transient snapshot, which would result in the DCPSPublication representing a particular Writer ending in the ALIVE state instead of the DISPOSED state. This could cause Readers to assume there is still at least one ALIVE Writer while in fact there is not, causing their data to stay ALIVE even when this is incorrect. Mechanisms like synchronous reliability also do not work correctly in such scenarios: if a DCPSSubscription represents a synchronous Reader, then the writing side can be fooled into believing there is still a Reader that needs to acknowledge its data, when in fact this Reader has already left the system. The writing side will then block waiting for an acknowledgement that will never be sent, effectively blocking the Writer application indefinitely. Solution: BY_RECEPTION_TIMESTAMP topics are no longer allowed to go back in time for data originating from the same DataWriter. Each individual instance with samples originating from the same source effectively becomes "eventually consistent". |
OSPL-14256 / 00021086 |
The reception of an old durability message from an already deceased fellow could cause the durability service not to become complete. When a remote node terminates, the networking service will notify the durability service through the heartbeat mechanism. The durability service will then remove this fellow from its administration. However, old durability messages may still be queued either in the networking service or in the durability reader cache. This could cause the durability service to think the fellow is alive again, which may make the durability service wait indefinitely for a response from that fellow. Solution: When the durability service receives an indication that a fellow has died, it maintains some information about that fellow for some time and records the timestamp at which the fellow died. When receiving a message from that fellow, it compares the receive timestamp of that message with the recorded timestamp to determine whether the message is old or came from a reconnected fellow. |
OSPL-14331 / 00021192 |
With RT networking, high packet loss may cause all the configured bandwidth to be consumed by resending packets. When there is high packet loss or a node is not responding, the resending of packets reduces the bandwidth available for normal traffic. As packets are resent point-to-point, it is useful to have a configuration option that specifies the bandwidth available for resends, not accounted for in the total available bandwidth, so that resends have less impact on normal traffic. Solution: The MaxResendBurstSize option has been added to the configuration to allow, in combination with the ResendThrottleLimit, control of the bandwidth available for resending packets without affecting the available bandwidth for normal traffic. |
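As an illustration only, such a channel configuration might look as follows. The element nesting and the values below are assumptions, not taken from this release note; only the names MaxResendBurstSize and ResendThrottleLimit come from the entry above. Consult the deployment guide for the authoritative element paths.

```xml
<!-- Hypothetical RT networking fragment; nesting and values are assumed. -->
<NetworkService name="networking">
  <Channels>
    <Channel name="reliable" reliable="true">
      <Sending>
        <ResendThrottleLimit>10</ResendThrottleLimit>
        <MaxResendBurstSize>5</MaxResendBurstSize>
      </Sending>
    </Channel>
  </Channels>
</NetworkService>
```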
OSPL-14357 / 00021504 |
When the durability service detects a large number of fellows in a short time interval, the total time to become complete again is longer than necessary. When a durability master detects a new fellow that is an aligner for the namespace, it will issue an alignment request to that fellow. During that time the transient store will be in the incomplete state, returning to the complete state when the alignment data has been received. When a large number of aligners connect in a short time interval, each aligner is handled one by one, making the total time to become complete again long. Combining the alignment actions for all these aligners reduces the time the transient store is in the incomplete state considerably. Solution: When a durability master detects more than one aligner connecting within a certain interval, the resulting alignment actions are combined into one. |
OSPL-14361 / 00021506 |
The FellowGarbageCollectionPeriod durability configuration parameter is incorrectly ignored. When parsing the configuration file, the xml path to the FellowGarbageCollectionPeriod parameter is wrong, causing the value to be ignored. Solution: The code now correctly parses the FellowGarbageCollectionPeriod parameter. |
OSPL-14373 / 00021511 |
Add a check to terminate the durability service when not making progress. When the durability service detects a conflict, the state of the durability service is set to incomplete. As a safety measure it should be possible to terminate the durability service when resolving the conflict does not make progress and the durability service remains in the incomplete state. For that purpose, it should be possible to set a timeout on the time to resolve a conflict; when that timeout expires the durability service should terminate with a fatal error. Solution: The configuration parameter ConflictResolutionTimeout has been added, which enables the user to set a maximum timeout on the resolution of a conflict. When this timeout expires, the durability service terminates with a fatal error. |
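A minimal sketch of how this could appear in a durability configuration; the placement and value below are assumptions, only the parameter name ConflictResolutionTimeout comes from this note:

```xml
<DurabilityService name="durability">
  <!-- Assumed placement: terminate with a fatal error if a conflict
       is not resolved within 300 seconds (value is illustrative). -->
  <ConflictResolutionTimeout>300.0</ConflictResolutionTimeout>
</DurabilityService>
```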
OSPL-14442 / 00021561 |
A crash may occur when using the DataReaderView. There appears to be a race condition between operations on the DataReader and an associated DataReaderView. When a sample is purged from the DataReader it will also be purged from the associated DataReaderView which may interfere with a read or take operation in the DataReaderView. Solution: Extra locking is added when performing purge operations from the DataReader on the associated DataReaderView. |
OSPL-14448 / 00021564 |
An instance may be incorrectly removed by durability when applying a REPLACE merge. When durability applies a REPLACE merge, it uses the snapshot time at which the aligner made the snapshot of the alignment data. The aligner includes the time of the snapshot in the alignment data sent to the requester. When applying the REPLACE policy, the snapshot time is converted to a local time, which is used to purge the transient store of the relevant data. When an application at the aligner has written a sample just after the aligner made the snapshot, and this sample arrives before the alignment data, then this sample may incorrectly be purged from the transient store because of the inaccuracy in the conversion from the snapshot time (the wall-clock time at the aligner at the moment the snapshot was made) to the local time, which uses a monotonic clock source. Solution: The REPLACE merge uses the snapshot time included in the alignment data directly to purge the transient store, instead of converting it first to a local time. |
OSPL-14477 |
The dispose_all function may interfere with a REPLACE or CATCHUP action performed by durability. The dispose_all functionality and the REPLACE and CATCHUP actions performed by durability use a common resource stored in the transient store, which is used to dispose the relevant instances. When a dispose_all is performed during a REPLACE or CATCHUP action, the dispose_all may overwrite this information, which may cause the result of the REPLACE or CATCHUP to be incorrect. Solution: The dispose information used by the REPLACE or CATCHUP policy is maintained by the durability service and no longer temporarily stored in the transient store. |
OSPL-14527 / 00021594 |
Segmentation fault in WaitSet_wait. The DDS_WaitSet_wait operation crashed when the condition sequence maximum was set but the buffer and the length were 0. Solution: The maximum size of the condition sequence is set to 0 when the sequence buffer is null. |
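The defensive normalization described in the solution can be sketched with a small self-contained example. The struct and function names below are hypothetical stand-ins, not the actual OpenSplice types; the point is only the invariant that a null buffer must imply a zero maximum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a DDS condition sequence:
 * maximum is the buffer capacity, length the used part. */
typedef struct {
    unsigned int maximum;
    unsigned int length;
    void **buffer;
} cond_seq_t;

/* If the buffer is NULL, a non-zero maximum is inconsistent and would
 * lead a wait call to index into a null buffer; clamp both fields. */
void cond_seq_normalize(cond_seq_t *seq)
{
    if (seq->buffer == NULL) {
        seq->maximum = 0;
        seq->length = 0;
    }
}
```

With this guard in place, a sequence initialized with a non-zero maximum but no buffer is treated as empty instead of causing a crash.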
OSPL-14538 |
Sample may be permanently lost when rejected locally. When a DataWriter tries to transmit a sample, it will try to deliver it in the following order:
Solution: A sample that is rejected by the local durability service will skip all other entities but gets retransmitted to all categories by the resendManager. This is to enforce eventual consistency between nodes where data that cannot be aligned can also not become available to other local entities. A sample that is rejected by the local DataReader(s) will be allowed to be delivered to the network services, but gets retransmitted only to the local DataReaders. |
OSPL-14553 / 00021620 |
The networking service option to send some durability messages point-to-point may cause message loss. Enabling the networking service option that improves durability message communication by sending durability messages addressed to one particular fellow point-to-point may cause durability messages to be lost, causing the durability service to fail. When using this option, durability messages may arrive at the durability reader out of order, because point-to-point and multicast messages use different networking partitions. A change in the handling of messages by a reader that uses by-reception timestamp causes messages from the same writer to also be checked on sequence number; messages with an out-of-order sequence number are rejected. Solution: When the networking service may use point-to-point communication for durability messages, it reorders the received durability messages by assigning increasing sequence numbers to them. |
OSPL-14554 / 00021621 |
Durability doesn't resolve master conflict after the stop of several nodes. The durability service monitors the system heartbeat of other nodes to detect whether a fellow node is alive. Both the spliced daemon and the networking service update the system heartbeat associated with remote nodes. It appears that there is a race condition between the update of this system heartbeat by the networking service and by the spliced daemon. A reset of a reliable channel in the networking service may cause the networking service to unregister the corresponding heartbeat and then write it again. When the spliced daemon reacts to the unregister of the heartbeat by disposing it, the durability service may receive the dispose after the alive heartbeat written by the networking service. This causes the durability service not to detect when the remote node terminates at some later time. When this happens during an ongoing alignment, the durability service may wait indefinitely. Solution: The garbage collection performed by the spliced daemon when a remote node terminates is now triggered directly by the not-alive state of the corresponding system heartbeat, without first having to dispose this heartbeat. |
OSPL-14564 / 00021633 |
When several nodes restart, the durability alignment takes too much time. During initial alignment the durability service will receive group information for the running nodes. To be able to align these groups, the groups have to be created locally. For that purpose either a local application should have created the group, or the corresponding DCPSTopic should be locally present to enable the durability service to create the group. In this case the running nodes are configured with master priority 0, which causes the durability service not to initially ask the present nodes for the builtin topics available to them. This may delay the initial alignment of that node because it has to wait for the presence of the DCPSTopic. Solution: The master of the builtin namespace initially sends alignment requests to already running nodes, including nodes configured with master priority 0. |
OSPL-14570 |
Syntax error in MATLAB class Vortex.Topic. An extra parenthesis present in the MATLAB class Vortex.Topic caused a compilation error when the class was used from a MATLAB script or class. Solution: The error has been fixed. |
OSPL-14580 / 00021621 |
Durability incorrectly uses the sample insert time to check if a sample is outdated. When a fellow node stops, the durability service might still receive messages from that fellow, which could incorrectly make the fellow alive again. For that purpose, some information about a deceased fellow is retained for some time. To check whether a message is outdated, the insert time of the sample was used, which is incorrect; it is better to use the allocation time, because the allocation time is set just after the message has been de-serialized. Furthermore, messages which are inserted after the durability service detects that a fellow has gone are queued for some time, because they may belong to a new lifecycle of the fellow. The function to check whether a sample is outdated also incorrectly returned true when it could not find the fellow. Solution: The allocation time of a received durability message is used to determine whether the message is outdated. |
OSPL-14591 / 00021604 |
DBMS Connector does not receive Exclusive Ownership messages. If a topic has its Ownership QoS set to Exclusive, any message using this topic is not read by the DBMS Connector, and therefore no row is added to the database. Solution: The DBMS Connector code copies the QoS from the found topic to the DataReader used to read the messages. The ownership QoS was not being copied to the DataReader QoS, so only the default ownership value (Shared) was accepted. |
OSPL-14602 |
Deadlock in Durability during termination. During termination of durability, the service can end up in a deadlock when an action is present in the actionQueue. During termination all the actions are removed from the actionQueue; when doing this, each action is executed one last time, but when durability is already in the terminating state, the execution of the action can cause durability to deadlock. Solution: A check was added to not execute any action when durability is in terminating mode, as the action serves no purpose at this stage. |
OSPL-14462 / 00021578 |
Missed Heartbeat info log message missing on disconnect. When a node was disconnected, a "Missed Heartbeat for node xxx" message was written into the ospl info log. This message was removed for performance reasons, resulting in users no longer being able to detect disconnects by scanning the info log file. Solution: A new "Disconnect detected for node xxx" message is now written to the info log file once a disconnect is detected. |
Report ID. | Description |
---|---|
OSPL-14132 |
Limit the number of builtin topics that a non-master will align to the durability master. The durability service that becomes the master of the builtin namespace has to acquire a total view of the builtin topics that exist in the system to enable the master to align each node. For that purpose, when the master detects a new fellow it will request the builtin topics from that newly detected node. For the builtin topics that correspond to DDS entities other than topics (DCPSPartition, DCPSSubscription, DCPSPublication, etc.) it is only necessary to align the master with those samples that correspond to entities on the node itself. Solution: When receiving an alignment request for a builtin topic group and the node is not the durability master for that namespace, only those samples that correspond to locally created DDS entities are returned. This does not apply to DCPSTopic. |
OSPL-11051 / 00018443 |
ddsi2e may crash when more than 2 to the power 24 entities are created. ddsi uses 24-bit entity ids. These entity ids should wrap around, but fail to do so. This causes the function that allocates a fresh entity id to fail when it has reached 2^24 entity ids. Subsequent dereferences of this entity id can cause a crash. Solution: Entity ids can now roll over, allowing reuse of slots of previously freed entities. |
OSPL-13932 / 00020817 |
Resurrection of fellows by processing outdated namespaces Durability services exchange namespaces with each other. There is a time window where a durability service (say A) receives a namespace from another durability service (say B), but before this message is processed durability service B becomes non-responsive. Because B becomes non-responsive, A will remove B from its administration. If a little later the pending namespace message from durability service B is processed by A again, then A will resurrect B. This is undesired, because B is not around anymore. Solution: A mechanism has been implemented so that A remembers the time when B became non-responsive. A uses this time to determine whether the namespace that it processed is outdated or not. |
OSPL-13996 / 00020855 |
Provide an option to the durability service to allow alignment data to be injected atomically in the data reader. When the durability service applies the replace policy it will purge an instance on the first sample that is injected, followed by the remaining samples of the same instance. Thus the reader could be triggered by each individual sample that is injected by the durability service. For example, in case the instance is already disposed and the alignment data still contains both a valid sample and a disposed sample, then on the injection of the valid sample the instance will become alive again, because the sample replaces the contents of the instance, and somewhat later the disposed sample is injected by the durability service, making the instance disposed again. Note that this behavior is a consequence of using the replace policy. In case this behaviour is not desired, an option has to be provided that allows the injection of the alignment data in the reader cache to be atomic. Solution: The option InjectMode with the values Blocked or Unblocked has been added to the durability configuration to control whether alignment data is injected atomically in a reader. By default this option is set to Unblocked. |
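For illustration, the new option could be set along these lines; the element placement is an assumption, only the name InjectMode and the values Blocked/Unblocked come from this note:

```xml
<DurabilityService name="durability">
  <!-- Assumed placement: inject alignment data atomically
       (default is Unblocked). -->
  <InjectMode>Blocked</InjectMode>
</DurabilityService>
```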
OSPL-14102 / 00020939 |
Abrupt reboot of a node may cause high latencies when using RT networking. When the networking service does not receive the shutdown notifications of a remote node, for example when that node crashes, a reliable channel will continue resending to that node until either the discovery heartbeat times out or the max retries is exceeded. This may cause the other nodes to experience long latencies because the resends leave less bandwidth for normal traffic. Solution: When no acknowledgements are received from a node for some time, the number of resends to that node is reduced until acknowledgements are received again. The configuration parameter MaxAckDelay specifies the maximum expected acknowledgement latency. When this threshold is exceeded the number of resends is reduced. |
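A hedged sketch of where such a threshold might be configured; the nesting, value, and units are assumptions, only the name MaxAckDelay comes from this note:

```xml
<!-- Hypothetical RT networking fragment; placement and value assumed. -->
<NetworkService name="networking">
  <Channels>
    <Channel name="reliable" reliable="true">
      <Sending>
        <!-- Maximum expected acknowledgement latency; resends are
             reduced when this threshold is exceeded. -->
        <MaxAckDelay>1000</MaxAckDelay>
      </Sending>
    </Channel>
  </Channels>
</NetworkService>
```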
OSPL-14124 |
Allow the durability service configured with master priority 0 to detect a master conflict with a higher priority master at an earlier stage. When a durability service is configured with master priority 0 it will become locally master when no durability service with a higher master priority is present. When a node with a higher master priority has selected itself as master, the durability service with the lower master priority should detect a master conflict, which should then resolve to the new master. A condition for raising a master conflict is that the conflicting master has passed a certain state. However, when the node with the highest master priority becomes master it is still in a state that does not raise a master conflict, which causes the master switch at the node that loses mastership to be delayed and may cause unnecessary alignments to occur. Solution: When using master priorities to select a master, a master conflict is raised when the node that caused the master conflict is either in the master selection state or in a higher state. |
OSPL-14139 / 00020976 |
Simulink: Usage of 'strnlen' in 'dds_participant_wrapper.c' file prevented code compilation. In the OpenSplice Simulink API, the usage of 'strnlen' in the 'dds_participant_wrapper.c' file prevented the compilation of the generated code of Simulink models. Solution: The problem is now fixed. |
OSPL-14142 / 00020979 |
The setting of InProcessExceptionHandling is ignored when handling an asynchronously sent exception signal. When handling a signal, the signal handler classifies the signal in three categories: exit signals, asynchronous exception signals and synchronous exception signals. The InProcessExceptionHandling parameter determines the behavior when handling an exception signal. The problem is that the setting of this parameter is ignored when handling an asynchronous exception signal sent from outside the process, which is handled as if it were an exit signal. Solution: The signal handler now handles both exception signal categories the same way. |
OSPL-14143 / 00020978 |
An asymmetrical disconnect may cause the durability service to crash. The durability service detects an asymmetrical disconnect when it receives an unexpected capability message from a fellow. In this case, the durability service will remove all namespace knowledge about that fellow. However, a race condition exists when another thread is currently processing a namespace message from that same fellow. Solution: The thread that handles namespace messages is no longer allowed to access the stored namespace information without holding the mutex lock. |
OSPL-14148 / 00020988 |
Memory leakage caused by messages no longer being freed after alignment. After a reconnect between nodes and after alignment, the new catchup mechanism will visit instances and messages to evaluate whether anything needs to be removed because it has been disposed and purged during the disconnect. While visiting messages the refcount is increased but afterwards never decreased, which leads to leakage of messages. Solution: Increasing the refcount is not actually required and has been removed. |
OSPL-14149 |
Configured catchup does not work for Dlite Due to invalid handling of the specified catchup directive the configuration had no effect. Solution: The bug is fixed and a regression test is implemented. |
OSPL-14151 / 00020989 |
In RT networking the synchronization of sequence numbers between sender and receiver may fail after a reconnect. When one node decides that another node has not responded in time, it will force a disconnect followed by a reconnect when it receives a new message from that node. The reconnect causes a sync message to be sent to the other node. Because the other node has not noticed the same disconnect, the reception of the sync message causes a reset of the reliable communication. A problem occurs when acknowledgement messages received after the reception of the sync (reset) are handled before the reset is handled. This may cause packets to be removed from the reliable administration that will never be resent, causing the sync message sent in response to the reset message to contain the wrong information. The acknowledgement information and reset information are passed from the receive thread to the send thread via different message queues, which may cause these messages to be handled in the wrong order. Solution: The reset sync message is always handled before handling acknowledgements received after the reset message. |
OSPL-14152 |
Potential slowing or stalling of historical data alignment for Dlite. When a node that wants to align others is disconnected, its state as known by others is not cleared. Others can potentially wait in vain for this node to start alignment. Solution: A fix has been applied that clears the state when a disconnect is detected. |
OSPL-14154 |
Catchup wrongly disposes instances after reconnect. When a node has catchup configured, it may wrongly purge/dispose instances after a reconnect. Solution: Additional information is exchanged between dlite services that prevents these wrong decisions. |
OSPL-14156 |
The catchup-initiated disposeAll memory leak. A disposeAll followed by a catchup alignment after a reconnect performed by the Durability service causes a memory leak, because the disposeAll is removed but not freed. Solution: Code was added to free the disposeAll when it is removed. |
OSPL-14170 / 00021011 |
The progress of the durability service is dependent on the local knowledge of the topics to be aligned. When a durability service is responsible for acting as either an aligner or alignee for a particular group (partition.topic combination), then either the durability service should be able to create the corresponding group, or the group should have been created locally by an application. For that purpose the associated topic information should be known locally, which is provided through the DCPSTopic builtin topic. Thus durability progress can be delayed until the DCPSTopic has been aligned or made available through an application. Note that this problem does not occur when using the ddsi service, because the ddsi service aligns DCPSTopic. Solution: Using the option to have the networking service align the builtin topics resolves this issue. |
OSPL-14175 / 00021028 |
When RT networking does not directly detect that a remote node has stopped, communication latency may increase. When a node stops, the RT networking service informs the other nodes by sending a number of discovery heartbeats indicating that the node is stopping. However, when the node crashes or these stop heartbeats are not received, a reliable channel may keep resending packets to the stopped node for some time, depending on the configuration settings. These resends reduce the bandwidth available for normal communication, causing extra latency. Solution: The number of resends is reduced when acknowledgements from a node are not received within a certain time. |
OSPL-14177 |
Order of elements in XML configuration is not preserved but matters in some cases. When parsing the configuration file, the elements contained in it were stored in an arbitrary order. For most configuration elements this is not important, but for some the order matters. For example, the PartitionMapping element may contain wildcards for the topic-partition selection, so the order in which the wildcard expressions are evaluated can be relevant. Solution: The order of the configuration elements is now maintained. |
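The ordering issue can be illustrated with a sketch of an RT networking partitioning section. The partition names, addresses and expressions below are illustrative, not taken from the release note; with element order preserved, the specific mapping is evaluated before the catch-all wildcard:

```xml
<OpenSplice>
  <NetworkService name="networking">
    <Partitioning>
      <NetworkPartitions>
        <NetworkPartition Name="SensorPartition" Address="239.255.0.2"/>
        <NetworkPartition Name="BulkPartition" Address="239.255.0.3"/>
      </NetworkPartitions>
      <PartitionMappings>
        <!-- evaluated first: sensor topics go to the dedicated partition -->
        <PartitionMapping NetworkPartition="SensorPartition"
                          DCPSPartitionTopic="sensors.*"/>
        <!-- evaluated last: everything else matches this wildcard -->
        <PartitionMapping NetworkPartition="BulkPartition"
                          DCPSPartitionTopic="*.*"/>
      </PartitionMappings>
    </Partitioning>
  </NetworkService>
</OpenSplice>
```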
OSPL-14190 |
Add option to RT networking to allow the dynamic network partition feature to be disabled. To support the dynamic network partition feature, the networking service uses a number of builtin transient topics. To limit the overhead of these special builtin topics, especially in systems containing a large number of nodes, an option to turn the feature off is desirable. Solution: The parameter DynamicPartitionSupport has been added to the networking configuration. |
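A minimal sketch of disabling the feature. The release note does not specify where the new parameter lives, so the placement below (directly under the NetworkService element) and the boolean value syntax are assumptions; consult the deployment guide for the authoritative path:

```xml
<OpenSplice>
  <NetworkService name="networking">
    <!-- assumed placement: disable the dynamic network partition feature -->
    <DynamicPartitionSupport>false</DynamicPartitionSupport>
  </NetworkService>
</OpenSplice>
```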
OSPL-14191 |
Add option to the durability service to not send namespace and group information for namespaces that are not configured as aligner. Currently the durability service transmits all configured namespaces and all corresponding group information to other fellows. However, in particular system configurations it is not always necessary to transmit the namespaces and groups for which this durability service is not configured as an aligner. Furthermore, the builtin groups are always present, so there is no need to send the builtin group information; it can be created implicitly on the detection of a fellow. Solution: The general parameter AligneeReportNamespace has been added to the durability service configuration to control whether namespace and group information for which the durability service is not an aligner is sent to fellow durability services. |
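A minimal sketch of the new option. The note only calls AligneeReportNamespace a "general parameter" of the durability service configuration, so its exact position and value syntax below are assumptions:

```xml
<OpenSplice>
  <DurabilityService name="durability">
    <!-- assumed placement: do not report namespace/group information
         for namespaces this service is not configured to align -->
    <AligneeReportNamespace>false</AligneeReportNamespace>
  </DurabilityService>
</OpenSplice>
```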
OSPL-14194 |
Transient instance lifecycle state change might be missed after reconnect When you get disconnected from a Transient Writer, you expect the lifecycle states of the instances originating from this Writer to become NOT_ALIVE. Likewise, after a reconnect these lifecycle states should revert back to the ALIVE state (if the Writer did not dispose/unregister them during the disconnect, that is). However, if the Reader has already read (not taken!) the sample from before the disconnect, upon reconnection it will discard the sample meant to revive the instance as a duplicate of the sample that was already in its cache (albeit in a READ state). This will correctly set the instance state to ALIVE, but your Reader will probably not notice this, since in typical use-cases Readers use the NOT_READ Mask to read data that is in the ALIVE state. In this particular example, when reading ALIVE, NOT_READ data, your revived instance will not show up because it is still in the READ state. Solution: In case of a reconnect, the sample used to revive the instance will always show up as NOT_READ, even if that same sample was already READ prior to the reconnect event. This is to make sure that you will actually be notified about the lifecycle state change that your instance went through, going back from NOT_ALIVE to ALIVE. |
OSPL-14223 |
Incorrect alignment of late joiners when simultaneously being aligned and receiving live data by Dlite Alignment is based on state differences, and after alignment the completeness is verified by comparing the data sequence number ranges between nodes. However, when new data is continuously being published, the state of nodes is always in flux, making the comparison unreliable. The problem is that state exchange and alignment data exchange are two separate topics that will never reflect the same state if published at different times. Solution: The state of the aligner is no longer published separately from alignment data but is incorporated into the alignment data as one consistent set. The alignee can now merge the correct state according to the data it received. |
OSPL-14232 |
Potential instance revival bug in case of multiple Writers per instance When a reader instance is registered by more than one Writer and all its Writers get disconnected, only the last disconnecting Writer will insert a NO_WRITERS message. However, if the first Writer to reconnect is not the Writer that inserted the NO_WRITERS message, it is not allowed to insert samples older than the NO_WRITERS sample (in case of BY_SOURCE_TIMESTAMP destination ordering, that is) to revive the instance back into the ALIVE state. This is wrong: since a NO_WRITERS message is only inserted when ALL participating Writers have disconnected, any Writer should be allowed to revive the instance. Solution: Any Writer is now allowed to revive an instance, not only the Writer that inserted the NO_WRITERS message. |
OSPL-14243 / 00021075 |
Spliced may refuse to start after non-graceful termination. When spliced is non-gracefully terminated (for example by suddenly rebooting the machine without bringing spliced down first), its key file (containing all relevant details about the Domain it is hosting) stays behind. If this key file is not removed by the user, the ospl tool used to start a new instance of spliced might wrongly assume the key file is still relevant and therefore refuse to start spliced if it has reason to believe the previous incarnation of spliced is still alive. This can for example happen when the processId recorded in the stale key file has meanwhile been reused by an unrelated process, making the previous incarnation of spliced appear to be still running. Solution: The ospl tool now does not just check whether a process with the recorded processId exists, but also verifies that this process is actually the spliced incarnation the key file refers to; if not, the stale key file is cleaned up so spliced can start. |
OSPL-14251 |
The networking service may crash when it receives an acknowledgement after a long delay. The networking service maintains a list of fragments for joining fellows (the late joining list). Transmitted fragments are stored in this list for some time to provide reliable communication to a joining fellow, which may initially have missed a number of fragments. This late joining list may become empty when the node is not sending data and all fragments on the list have timed out. When receiving a first acknowledgement from a fellow node after a very long delay, the networking service may crash accessing the empty late joining list. Solution: When handling a first acknowledgement from another node, it is now checked that the late joining list is not empty. |
OSPL-14252 / 00021068 |
DDSI2 udp transmit thread potentially blocking indefinitely on Windows The DDSI2 service on Windows can block and eventually get terminated due to lack of transmit-thread progress. This is due to a Windows API call that should always return but, in a specific deployment reported by a customer, was observed to block indefinitely. Solution: The code was changed to use a configurable timeout (//OpenSplice/DDSI2Service/Internal/UDPWriteTimeout), which defaults to 5s. This allowed the service to work correctly in the customer-reported scenario, while still detecting unexpected or irregular delays when the timeout is exceeded. |
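The configuration path above is given in the note; the duration notation and the service name below are assumed to follow the usual deployment-guide conventions:

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Internal>
      <!-- maximum time a UDP write may block before it is
           reported as an unexpected delay (default: 5s) -->
      <UDPWriteTimeout>5s</UDPWriteTimeout>
    </Internal>
  </DDSI2Service>
</OpenSplice>
```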
OSPL-14253 / 00021081 |
MATLAB: "statement is incomplete" error when topic type reference is invalid When creating a DDS Topic with the MATLAB API, you must pass the Vortex.Topic class constructor a reference to the MATLAB class generated by idlpp. Typically, such a reference is created by the MATLAB "?" (question mark) operator. Note, however, that this operator silently returns an empty class array should the referenced name not be found. This could occur because of a typographical error, or because the referenced class is not in the MATLAB path. The Vortex.Topic constructor did not detect such invalid class references, which ultimately caused an exception with the text: "Error: The statement is incomplete." Solution: Argument checking in Vortex.Topic has been improved to detect the result of an invalid class reference operation, and to issue a clearer error message in this case. |
OSPL-14263 |
When the name of a DLite service exceeds 128 characters a crash can occur DLite services have a name. These names are used in the log files that are generated by DLite services. A buffer of 128 characters was internally used to contain the name. This leads to a crash in case the name exceeds 128 characters. Solution: The buffer is now dynamically allocated based on the length of the name. |
OSPL-14276 / 00021099 |
The durability service may crash when a fellow node is disconnected and directly reconnected again. Initially the role field of a fellow namespace is unset; the role is set when the first namespaces message is received from that fellow. When that fellow is disconnected and shortly thereafter reconnected while one of its namespaces messages is being processed, the namespace administration of that fellow is initialized again and the role field becomes unset. However, the processing of the old namespaces messages expects the role field to still be valid. Solution: When accessing the role field of a fellow namespace, it is now checked whether this field is valid. |
OSPL-14311 |
Missing Protobuf header files In the case of an OpenSplice release without Java tooling support, the Protobuf header files were missing from the release. Solution: The Protobuf header files are now included in releases without Java tooling support. |
OSPL-14326 / 00021133 |
Reliable communication stalled after an asymmetrical disconnect/reconnect when using RT networking. When a disconnect/reconnect occurs a reset message is sent to the remote entity. This reset should reset the synchronization of the sequence numbers between sender and receiver. In the case where the expected packet indicated by the reset message was received before the reset message and is still on the out-of-order list, it could occur that the out-of-order administration is not updated correctly, causing the processing of the out-of-order list to stall. Solution: When a reset message is received, update the out-of-order list correctly and possibly restart processing of the out-of-order list from the expected sequence number. |
OSPL-14329 / 00021058 |
Possible crash in repeated DDS Security plugin loading mechanism In single-process mode, creating and deleting participants causes service and plugin libraries to be loaded and unloaded on the fly. In the DDS Security plugin mechanism of the DDSI2 service, initialization code is not robust against executing more than once and could crash when the location of libraries changes between the first and second time they are loaded into the process. Solution: The initialization code of the plugin-loading mechanism was fixed to allow multiple executions, supporting restarting the DDSI2 service within the same process. |
OSPL-14334 / 00021196 |
The networking service may crash when using the option to manage the builtin topics. When handling a topic request from a fellow node and that fellow node has already been removed from the administration because it became disconnected, a trace message tries to dereference an invalid pointer. Solution: The trace message is corrected. |
OSPL-14360 |
struct and union definitions that have the same name as the modules in which they are defined can cause crashes. Internal processing (deserialization) of the XML type descriptor tries to process the module as if it were the struct definition, leading to a crash. The problem is that an internal search operation during deserialization searches the wrong scope and returns the module with the same name. Solution: The search operation is replaced by one that searches the correct scope. |
OSPL-14411 / 00021533 |
DLite in combination with native networking may leak range info belonging to builtin topics. DLite keeps an internal administration of the ranges of sequence numbers it received per individual writer. If DLite is used in combination with native networking (which by default doesn't take care of aligning the builtin topics like ddsi does), this range administration includes the writers for the builtin topics. However, due to a bug in the initialization of the transient store, the tombstone_cleanup_delay was set to infinite for the builtin topics, causing their range info to never be cleaned up, effectively causing a little memory leak for every participant that leaves the system. In scenarios where participants are coming and going all the time, this little memory leak might eventually accumulate and cause the system to run out of memory. Solution: The tombstone_cleanup_delay for builtin topics is now set to 5 seconds. This means that all range info describing the builtin topics from a particular participant will be cleaned up 5 seconds after that participant leaves the system. |
OSPL-14413 |
The delivery thread of RT networking accesses the shared memory directly without setting the protection count. When the SMP option is enabled in RT networking, a separate thread is created which writes the received samples (messages) into the shared memory. However, this thread does not set the shared memory protection count. As a result, when an exception (signal) occurs while a sample is being written into the shared memory segment, it is handled as a normal user kill exception (signal), and the networking service detaches from the shared memory before re-raising it. Solution: The main part of the asynWriter (delivery) thread now runs with the shared memory protection count raised. |
OSPL-14443 / 00021555 |
C API has potential issues when dealing with UserDataQosPolicy, GroupDataQosPolicy or TopicDataQosPolicy. The C API had some bugs in its handling of the sequence of octets used in the UserDataQosPolicy, GroupDataQosPolicy and TopicDataQosPolicy. These bugs could either cause a double free (resulting in a crash) or a memory leak when invoking the get_qos, set_qos or get_default_…_qos operations. Solution: The code that handles the sequence of octets has been corrected, removing the potential double free and the potential memory leak. |
OSPL-13924 |
Remove hard idlpp dependency for dynamic types in Python API A friendly warning has been added to the Python package installer to remind customers when idlpp is not available. Dynamic type construction with Python dataclasses (available in Python 3.6+) has been added for those who cannot pregenerate Python classes with idlpp. Solution: Customers wanting to work with dynamic types in Python without idlpp at runtime can take a look at tools/python/examples/example5.py for a demonstration of this. |
OSPL-14036 / 00020881 |
Added human readable timestamps in Dlite tracing Dlite tracing reports are timestamped with the internal representation of the system time (wall clock). However, to improve readability and comparison with other log files it is desired to add a human readable and standardized format. Solution: Each line is prepended with an additional representation of the timestamp with a format similar to other logfiles. |
OSPL-14123 |
Allow the option to determine the master on the durability service that first started to be applied to transient namespaces. When using master selection based on priorities and there are several nodes with the highest priority, the master is chosen first on the namespace quality and then on the system-id. For namespaces configured as transient, the quality will always be 0, so the node with the highest system-id becomes master. Thus when a new node is started, it may take over the mastership of these transient namespaces, causing new alignments to occur. However, it can be preferable that in this case the first selected master remains the master for the namespace. For persistent namespaces there is the option to select the master based on the node that was first started in case the priorities are the same. By applying this option also to transient namespaces, a newly started node with the same master priority cannot take over mastership. Solution: The option to select a master based on the node that was first started is now also applicable to transient namespaces. |
OSPL-14141 |
durability_migrate logging and v6.6 KV store support The addition of more descriptive logging output and support for durability KV stores produced by the OpenSplice 6.6 release is desirable. Solution: More descriptive logging output and support for durability KV stores produced by the OpenSplice 6.6 release have been added. |
OSPL-14298 / 00021107 |
Python API: improved performance of ddsutil.get_dds_classes_from_idl when retrieving multiple topics from one idl file. The Python API function ddsutil.get_dds_classes_from_idl would re-parse an idl file for every topic type you retrieved from it. Solution: ddsutil.get_dds_classes_from_idl now caches the parsing results of an idl file. Pass the `force_reload=True` optional keyword parameter to disable this new behavior. |
OSPL-14403 |
Improve the resend mechanism of RT networking. The current resend mechanism in RT networking walks, each resolution tick, over all the possible packets in both the node administration and the late joining list. This is not very efficient. Furthermore, resends are performed when an acknowledgement is not received within the time specified by the recovery factor. However, in systems where the acknowledgement latency is very unpredictable, for example systems using VMs, it can be better to only perform resends when the receiver indicates that it is missing packets. An option should be added to allow the resend mechanism to operate in either mode: one mode that performs a resend when a packet is not acknowledged within the resolution time, and one mode that only performs resends when the received ack messages indicate gaps of missing packets. Solution: A time-ordered queue is used to schedule the packets that potentially have to be resent when an acknowledgement is not received in time. When an acknowledgement message indicates that some packets are missing, these packets are resent first by moving them to the head of the resend queue. |
Report ID. | Description |
---|---|
OSPL-14113 / 00020953 |
Idlpp for Python API creates circular dependency in the generated file. In the OpenSplice Python API, when an IDL file defines a module B inside module A and uses something from module B (e.g., an enum) inside module A, the generated code results in a circular dependency during the import of module A. Solution: The problem is now fixed. |
OSPL-14065 / 00020912 |
Incorrect inconsistency detection in the DLite differential alignment protocol. The differential alignment protocol wrongly concluded that not all data was received from writers that had not published any data yet. This could prevent wait_for_historical_data from unblocking, which can cause numerous application issues. Solution: The protocol logic has been fixed by excluding publishers from differential alignment calculations until they have published data. |
OSPL-14128 |
Race condition in management of instance lifecycle states In the 6.11.0 release, a new mechanism was introduced that allowed the spliced to revive instances that lost liveliness due to a disconnect when their Writer was re-discovered. However, this caused a race condition between the spliced trying to revert the instance back to its state prior to the disconnect, and durability/dlite, which are trying to update the instance to its latest state. Both mechanisms should have been commutative, but in certain scenarios they were not, and this could cause the instance to end up in the wrong lifecycle state. Solution: Both mechanisms are now fully commutative, and the resulting instance lifecycle state is now eventually consistent with the rest of the system. |
OSPL-13652 / 00020635 |
Add missing Python API methods and ensure the subscriber partition setting is functional. Methods to set Qos, read_status and take_status have been added, and setting the partition on creation of a subscriber now functions; a test for this is included. Solution: You can now change the Qos policies of entities (but only those allowed by DDS), read/take the status of an entity, and set the partition of a subscriber by Qos, for example via the XML QosProvider. |
OSPL-13665 / 00020647 |
Issue with invalid samples when using read condition causes readers to get stuck, unable to read samples. In specific cases when a reader has both instances with invalid samples and others with valid samples, if samples are read using a condition on e.g. view, sample or instance states which an instance with invalid sample(s) doesn't meet, no other instances are considered and a 'no-data' result is returned to the application. Note this applies to operations such as take_w_condition but also to conditions in waitsets. Solution: The processing of instances with invalid samples contained a bug in a return code, causing the implementation to stop iterating instances prematurely and return a 'no-data' result to applications. |
OSPL-13743 / 00020702 |
Possible alignment mismatch when an asymmetrical disconnect occurs during alignment When nodes get reconnected after being disconnected, a request for alignment data is sent. When there is an asymmetrical disconnect AFTER the aligner has received the request but BEFORE the aligner has actually sent the data, then the alignee drops the request but the aligner does not. When the asymmetrical disconnect is resolved, the alignee sends a new request for alignment data to the aligner. It now can happen that the aligner sends the alignment data of the FIRST request to the alignee, and the alignee considers this as the answer to the SECOND request. When the alignee receives the alignment data to the SECOND request, the data gets dropped because there is no outstanding request anymore. This can lead to an incorrect state whenever the alignment set has been updated between the first and the second request. Solution: The answer to the first sample request is not considered a valid answer to the second request any more. |
OSPL-13787 / 00020504 |
Python: "dds_condition_delete: Bad parameter Not a proper condition" entries in ospl-error.log In some circumstances, the ospl-error.log can contain errors with the text "dds_condition_delete: Bad parameter Not a proper condition", even though customer code appears correct. This is due to unexpected interactions between the Python garbage collector and the underlying C99 DCPS API used by the Python integration. An attempted fix (OSPL-13503, released in 6.10.4p1) was later reverted in 6.11.0 (OSPL-13771) because it caused a memory leak. Solution: The issue has been fixed. The fix tracks implicit releases of C99 'condition handles' by tracking garbage collection of the parent DataReader class, and prevents duplicate calls to dds_condition_delete in those cases. Note that customer code can still cause an entry in ospl-error.log, but only when code explicitly calls DataReader.close() prior to explicitly calling Condition.delete(). In such a code sequence, an error message is a reasonable expectation. |
OSPL-13869 / 00020756 |
Long delay to get read and write functions result for large sequence types when using Python API When using OpenSplice Python API in the applications, there was a delay to get the results of the read and write functions for sequence types. The problem was only apparent with large sequences: ~ 1MB or so. In the Python API code, it was looping through the list in data construction, serialization and deserialization. Thus, it was taking a long time to get results back from the read and write functions for large sequence types. Solution: For sequence types, the Python API code is now updated to generate array.array in data construction and deserialization if the sequence type is supported by the built-in array module. The serialization process for the sequence types is also updated to skip the looping through of the data if it is an instance of array.array. The updated code reduces the read and write functions time for the sequence types that are supported by the built-in array module. It is essential to regenerate the code from the IDL file using idlpp in order to achieve performance improvements. The unregenerated code will continue to execute correctly. Please note that the large sequences of types not supported by the built-in array module (enum, boolean, string, char and struct types) will not see any performance improvements. |
OSPL-13914 / 00020791 |
A delay in the scheduling of the durability service threads may cause the spliced daemon to consider the durability service dead. The durability service has to report its liveliness regularly to the spliced daemon. When the durability service does not report its liveliness within the lease period, the spliced daemon considers it dead. The durability service contains a watchdog mechanism to monitor the liveliness of its threads; when this watchdog finds a thread non-responsive, the durability service delays the liveliness notification to the spliced daemon. When a durability thread goes to sleep, it informs the thread watchdog accordingly. However, a high CPU load may cause a thread to sleep longer than expected. In one location in the code, the sleep time reported to the thread watchdog was too small, which could cause the watchdog to consider a thread non-responsive when the sleep took much longer than expected. Solution: The sleep time reported to the thread watchdog has been increased to take into account scheduling delays caused by a high CPU load. |
OSPL-13926 / 00020810 |
The ospl stop command may block. When an 'ospl stop' or 'ospl stop -a' command is issued, the ospl tool searches for the associated key file(s). When a key file indicates that the spliced daemon is still initializing, the ospl tool waits until spliced reports that it has become operational. This is done to prevent problems when the ospl tool tries to terminate a spliced daemon that is still starting. However, when a spliced daemon was killed during its startup, it may leave behind a key file that still indicates it is initializing. When the 'ospl stop' command finds such a key file, it will wait indefinitely. Solution: When the 'ospl stop' command finds a key file indicating the initializing state, it now checks whether the corresponding spliced daemon is still running. When that is not the case, it tries to clean up the resources left by the killed spliced daemon and removes the key file. |
OSPL-13954 |
MATLAB: IDLPP generates incorrect code for sequence of structs For the MATLAB language, IDLPP generated incorrect code to serialize (write) a sequence of structs, resulting in an exception during sample writes. Solution: IDLPP has been updated to generate the correct code. |
OSPL-13964 / 00020844 |
Instance liveliness sometimes incorrectly set to NO_WRITERS for late joiners When a late joiner joins an already running system, the TRANSIENT samples already published by the other nodes get aligned by either durability or dlite. The alignment consists of two phases: the delivery of the historical samples themselves, and the discovery of the Writers that published them, which determines the liveliness of the corresponding instances. When the discovery of a Writer was delayed relative to the delivery of its samples, the liveliness of the already aligned instances could incorrectly remain NO_WRITERS.
Solution: A Writer whose discovery is delayed will now correctly update the liveliness of instances for which samples had already been received prior. |
OSPL-13973 |
When the call to DDS_DomainParticipant_find_topic() times out, no error message should be reported One of the ways to acquire a topic is to call DDS_DomainParticipant_find_topic(). When the topic could not be found within the specified duration, an error message was generated in ospl-error.log. Generating such an error message is incorrect, because not being able to find the topic is legitimate behaviour. Solution: The error message is no longer generated. |
OSPL-14049 |
Serializing big samples on Windows may consume a lot of CPU when using ddsi The ddsi serializer serialized into 8Kb blocks, reallocating an additional block when the current one was not big enough. For very big samples this potentially resulted in a large number of realloc operations, which on Windows often involve memcopying one memory area into another before proceeding, consuming a lot of CPU in the process and consequently impacting network latencies quite dramatically. Solution: Windows now uses a more efficient algorithm to reallocate memory, and the ddsi serializer now converges to its eventual size in fewer iterations. |
OSPL-14066 / 00020914 |
Improve handling of unhandled application exceptions on Windows. When an exception occurs in application code and that exception is not handled, it terminates the application. When this happens while there are still threads busy in the OpenSplice shared memory segment, it cannot always be ensured that the shared memory is still in a consistent state, in which case the spliced daemon will stop OpenSplice. On POSIX systems this is handled by the OpenSplice signal handler to protect the shared memory. Solution: For Windows, an unhandled-exception handler has been added which tries to stop threads, or let them leave the shared memory segment, when an unhandled exception occurs in application code. |
OSPL-14067 |
When the durability config specifies master priority selection on several namespaces, master selection may not converge. When master selection is based on master priorities, the node with the highest priority wins. However, when there is more than one node with the highest priority, the quality of the namespace is used to select the master. When a master is selected, the node sets its own quality to that of the master. However, this quality was by mistake also set on other namespaces that use master priority selection. This could cause master selection on some namespaces to become unstable. Solution: When a master has been selected for a namespace, the quality of the master is now copied only to the corresponding namespace and not to all. |
OSPL-14079 |
When a user clock is used, the DLite service uses the user clock instead of the wall clock in its log files When a user clock is configured, the services use the user clock instead of the wall clock to timestamp their log files. Sometimes customers want the wall clock in their logs instead, because the wall clock better matches human intuition of time. To have service log files use the wall clock, the attribute //OpenSplice/Domain/UserClockService[@reporting] can be set to false. Solution: The clock used to report timestamps in service log files is now chosen based on the //OpenSplice/Domain/UserClockService[@reporting] setting. |
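The attribute path is given in the note; the child element and module name below are illustrative of a typical user-clock configuration, not taken from the note:

```xml
<OpenSplice>
  <Domain>
    <!-- reporting="false": service log files are timestamped with the
         wall clock instead of the configured user clock -->
    <UserClockService reporting="false">
      <UserClockModule>libmyuserclock</UserClockModule>
    </UserClockService>
  </Domain>
</OpenSplice>
```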
OSPL-12820 |
ddsi may have problems handling big data samples DDSI breaks samples up into fragments to be able to send them over the network effectively. In case (part of) the sample is 'lost' on the network, DDSI can resend the fragments that have been lost, or alternatively the complete sample (so all fragments). For large samples, the chance of at least one fragment being lost is relatively high, and given the size of the complete sample, resending it completely is very costly, let alone the chance of losing part of the resent sample again. The ddsi protocol for handling fragments does not allow the receiver to Ack individual fragments (it can only NACK missing fragments), so it is hard to notify the sender when it is sending too fast. Solution: By creatively playing with the various freedoms that the ddsi protocol allows, we can prevent ddsi from retransmitting the whole message if it is not Acked in time. This should improve throughput and the overall stability of the connection. |
OSPL-13985 |
Add option to durability to prevent injecting persistent data more than once when using master priorities. When the durability service uses master selection based on priorities (masterPriority attribute set), it may occur that the persistent data is injected more than once by different durability services. This happens when one durability service has taken mastership of a namespace because it has the highest properties for that namespace, but later another durability service is started which has better properties for that namespace; it will take over the mastership of that namespace and inject its persistent data again. When using master selection based on priorities, the properties that determine which durability service will become master are first the priority, next the quality of the namespace, and finally the system-id of the node. When there are several potential aligners in the system and there is no hierarchy among them, they can be configured with equal master priorities. However, it is then still not guaranteed that only one durability service will inject its persistent data, because now the selection is made on the quality of the namespace. In case that is not desirable, an option is needed to select the master not on the quality of the namespace but on the durability service that was started first. Solution: For this purpose the attribute 'masterSelectBy' is added to the 'Policy' configuration element associated with a particular namespace. The possible values of this attribute are 'Quality' or 'FirstStarted'. When set to 'Quality', which is the default, the master selection criterion is based on the quality of the persistent data. When set to 'FirstStarted', the master selection criterion is based on the startup time of the durability service. |
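As an illustration, a namespace policy using the new attribute might look like the sketch below. Only 'masterSelectBy' is the attribute introduced by this change; the element nesting and the other attribute values are assumptions based on a typical durability configuration and may differ in your deployment:

```xml
<DurabilityService name="durability">
  <NameSpaces>
    <NameSpace name="defaultNamespace">
      <Partition>*</Partition>
    </NameSpace>
    <!-- masterSelectBy="FirstStarted" selects the master on startup time
         instead of the (default) quality of the persistent data -->
    <Policy nameSpace="defaultNamespace" durability="Durable"
            aligner="true" alignee="Initial"
            masterPriority="1" masterSelectBy="FirstStarted"/>
  </NameSpaces>
</DurabilityService>
```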
OSPL-13989 |
Add the option to have a durability service not advertise itself as an aligner when using masterPriority=0. When configuring a namespace with masterPriority=0 and aligner=true, the durability service will act as a master for this namespace as long as no globally selected master is available, but it will never become master for other nodes. However, it will still advertise itself as an aligner for that namespace. As an aligner for other nodes it may contain durable data that the durability service that has become the master still needs to retrieve to be able to distribute this data over the system. However, it is not always desired that the master retrieves this data from the nodes that have configured masterPriority=0. Especially in systems with a large number of nodes, skipping these nodes may reduce the initial alignment load on the system. Solution: To provide this option, the possible values of the aligner attribute of the Policy element associated with a particular namespace have been extended with the value 'Local'. Thus the valid values of the aligner attribute are now True, False or Local. Note that 'Local' may only be used in combination with masterPriority=0. |
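A sketch of a policy using the new value; the surrounding attributes are illustrative, and per the note above aligner="Local" is only valid together with masterPriority="0":

```xml
<!-- This node keeps a local aligner role for the namespace but does not
     advertise itself as an aligner to other nodes -->
<Policy nameSpace="defaultNamespace" durability="Durable"
        aligner="Local" alignee="Initial" masterPriority="0"/>
```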
OSPL-14022 |
Set the default rate for a DLite service to publish its metrics to every 10 seconds. Any DLite service can periodically publish metrics that can be received by monitoring services to assess the health of the DLite service. The default frequency was to publish a metric every 1s. In practice this is not needed and could potentially lead to unnecessary load. A 10s period seems to be a more sensible default. The default value can be overridden by //OpenSplice/DLite/Settings/metrics_interval. Solution: A DLite service now publishes metrics every 10s unless the value is overridden by //OpenSplice/DLite/Settings/metrics_interval. |
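For example, restoring the old 1-second metrics period could look like the sketch below; the 'name' attribute and the assumption that the interval is expressed in seconds are illustrative:

```xml
<OpenSplice>
  <DLite name="dlite">
    <Settings>
      <!-- publish DLite metrics every 1 second instead of the 10s default -->
      <metrics_interval>1</metrics_interval>
    </Settings>
  </DLite>
</OpenSplice>
```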
OSPL-13345 / 00020318 |
Simulink - override partition from Publisher and Subscriber blocks. In previous Vortex OpenSplice releases, the partition name was only settable through the QoS Profile that is selected in the Simulink blocks. Solution: Partition overrides are now possible on Publisher and Subscriber blocks via block parameters. |
OSPL-13407 / 00020374 |
When a user clock is used, the services use the user clock instead of the wall clock in their log files. When a user clock is configured, the services use the user clock instead of the wall clock in their log files. Sometimes customers want to use the wall clock instead of the user clock in their logs, because the wall clock better appeals to the human intuition of time. To have service log files use the wall clock instead of the user clock, the attribute //OpenSplice/Domain/UserClock[@reporting] can be set to false. Solution: The clock that is used to report time stamps in service log files is chosen based on the //OpenSplice/Domain/UserClock[@reporting] setting. |
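A minimal sketch of this setting; any further user-clock sub-elements (such as the clock module definition) are omitted here:

```xml
<OpenSplice>
  <Domain>
    <!-- reporting="false": service log files use the wall clock
         even though a user clock is configured -->
    <UserClock reporting="false"/>
  </Domain>
</OpenSplice>
```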
OSPL-14104 |
Memory leakage of resend data on writer deletion. Data published by a writer that could not be delivered because of temporary unavailability of peers is maintained by the writer in its history cache until it can be delivered. Although it is unlikely that resend data exists at writer deletion, when it does it will leak because the writer destructor fails to delete the data. Solution: Implemented resend data deletion in the writer destructor. |
OSPL-14105 |
Support DDSI discovery caused by multicast discovery on one side only. The DDSI service sends participant discovery messages to the SPDP multicast address (if enabled), any addresses configured as peers, and any unicast addresses it added at run-time based on discovery. That third group did not include addresses of peers that advertise themselves via multicast while the local node has SPDP multicast disabled (a very non-standard but sometimes quite useful configuration). The result is that discovery could occur in only one direction and that some temporary, asymmetrical disconnections could not be recovered from. Solution: It now adds the unicast addresses of discovered peers when SPDP multicast is disabled. This ensures bidirectional discovery and recovery from all temporary asymmetrical disconnections. |
OSPL-13502 / 00020501 |
Python: Access the version number of OpenSplice in the Python dds module. In the previous releases of OpenSplice, it was not possible to access the OpenSplice version number in the Python dds module. Solution: It is now possible to get the OpenSplice version number in the Python dds module using the "__version__" attribute. Please note that if the Python code is recompiled with a different OpenSplice version, the Python dds package will still show the OpenSplice version number it originally came with. |
OSPL-14064 / 00020913 |
Service crash during startup when configuration becomes unavailable. When service configuration is removed from the configuration file after spliced has started but before spliced has started the service, the service may crash due to missing configuration elements. Solution: Extra checks and error reports were added to allow graceful termination of the service instead of a crash. |
Report ID. | Description |
---|---|
OSPL-13920 / 00020795 |
Node.js DCPS: Errors when importing IDL for topics using typedef references. In Node.js DCPS, an importIDL API is provided to import topic types defined in IDL. The importIDL API generates XML using idlpp, then processes the XML. If the IDL included typedef references to other typedef references, the end user would see errors, because the processing of the idlpp-generated XML did not handle typedef references to other typedefs. Solution: The OSPL Node.js DCPS code has been fixed to handle the cases where topics defined in IDL include typedef references to other typedefs. |
OSPL-13943 / 20692 |
Durability alignment is not consistent among several nodes when using a REPLACE policy. When the durability service performs a REPLACE alignment policy, the corresponding instances (based on the timestamp of the alignment) are first wiped from the transient store before the aligned samples are injected. When in the meantime a dispose of a DCPSPublication corresponding to some of the aligned data is handled by the spliced daemon, it may occur that these instances are placed on a purge list before the aligned samples are injected. In this case the injection of the samples incorrectly did not remove these instances from the purge list. Solution: An instance is always removed from the purge list for empty instances when a sample is injected and the instance becomes non-empty. |
OSPL-13909 |
Durability should wait for the presence of remote durability protocol readers when using ddsi. When the ddsi service is being used and the durability service detects a fellow durability service, the durability service should wait to send messages to that fellow until it has detected the remote durability readers. Due to some configuration parameter changes this function had mistakenly been disabled. Solution: The check for the presence of remote durability readers is enabled when the ddsi service is used as the networking service. |
OSPL-13724 / 00020481 |
The Vortex.idlImportSlWithIncludePath function call of the Simulink Vortex DDS blockset was causing an error on the Windows platform. On the Windows platform, the Vortex.idlImportSlWithIncludePath function call of the Simulink Vortex DDS blockset was causing an error when passing the includedirpaths argument, because the function was passing the arguments in the wrong order to the IDLPP tool. Solution: The bug is now fixed. The Vortex.idlImportSlWithIncludePath function has been updated to pass the arguments in the correct order to the IDLPP tool. |
OSPL-12485 |
Possible incomplete transaction when aligned by durability. It was possible that a transaction was incomplete when aligned by durability, because all transactional samples were treated as EOTs. All transactional samples were compared as if they were EOTs, which could lead to transactional samples being discarded as duplicates and not aligned. Solution: Made sure only EOT messages are compared. |
OSPL-12877 |
Alignment may stall when a new local group is created while a merge request to acquire the data via a merge for the same group is being scheduled. When a durability service learns about a partition/topic combination, it may have to acquire data for this group by sending a request for samples to its master. When at the same time a merge conflict with the fellow is being handled, this may also lead to the sending of a request for samples to the same fellow. Both paths are decoupled for performance reasons, and so there is a window of opportunity that may lead to two identical requests for the same data to the same fellow. Since the requests are identical, only one of them is answered. The remaining one never gets answered, which may potentially stall conflict resolution. Solution: The requests are distinguished so that they are never identical. |
OSPL-13307 / 00019125 |
When running mmstat -M some of the numbers created are incorrect. The variables which are created by mmstat that represent a difference are output as unsigned long int. This means that negative numbers are incorrectly output. Solution: Changed the data type of variables that represent a difference from unsigned long int to signed long int to avoid incorrect output in mmstat -M. |
OSPL-13532 |
Stalling alignment when an asymmetrical disconnect occurs during the durability handshake. When durability services discover each other, they must complete a handshake before data alignment can start. The handshake consists of several stages: # Whenever a durability service discovers another durability service, it pushes a so-called Capability message to the discovered durability service. A precondition for this to happen is that the Capability reader of the remote durability service must have been discovered, otherwise the Capability message may get lost. These Capability messages are essential to detect, and recover from, asymmetric disconnects. # Once a Capability message has been received from a remote durability service, it is possible to request its namespaces by sending a so-called nameSpacesRequest message (of course, after having discovered the reader for this message on the remote durability service). This should trigger the remote durability service to send its namespaces, after which the handshake is completed. There are two problems with the handshake. First of all, when a durability service sends its request for namespaces to the remote durability service, there is no guarantee that the remote durability service has discovered its namespaces reader at the time the namespaces are being published, so they can get lost. Secondly, and more likely, when an asymmetric disconnect occurs while establishing the handshake, it is no longer possible to detect that an asymmetric disconnect has occurred, and therefore it is no longer possible to recover from this situation. This effectively leads to a situation where the handshake is not completed, and therefore alignment is stalled. Solution: There are two solution ingredients. # When a Capability message is published to a remote durability service, ALL its relevant readers must have been discovered instead of only the Capability reader.
# To resolve the stalling handshake due to asymmetric disconnects occurring during the handshake, the Capability message and nameSpacesRequest message are republished when the handshake takes too long to complete. This can be controlled using two environment variables: ## The environment variable OSPL_DURABILITY_CAPABILITY_RECEIVED_PERIOD specifies the time (in seconds) after which to republish a Capability message. The default is 3.0 seconds. ## The environment variable OSPL_DURABILITY_NAMESPACES_RECEIVED_PERIOD specifies the time (in seconds) after which to resend a nameSpacesRequest. The default is 30.0 seconds. |
OSPL-13692 / 00020677 |
Networking throttling may slow down communication too long. When a receiver experiences high packet loss, the backlog will increase, which is communicated to the sender. In that case the sender applies throttling to reduce the load on the receiver. However, throttling is also applied to the resending of the lost packets. This may cause the backlog at the receiver to decrease at a low rate, causing the throttling to be applied longer than necessary. Solution: A parameter (ResendThrottleLimit) is added which sets the lower throttling limit for resending lost packets. Furthermore, when the sender detects that there are gaps in the received acknowledgements, resends are performed earlier. |
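A configuration sketch for the new parameter; the placement inside the channel and the value shown are assumptions (the entry only names the parameter), so consult the deployment guide for the exact element path and units:

```xml
<NetworkService name="networking">
  <Channels>
    <Channel name="reliable" reliable="true" default="true">
      <Sending>
        <!-- assumed placement: lower throttling limit applied when
             resending lost packets -->
        <ResendThrottleLimit>1024</ResendThrottleLimit>
      </Sending>
    </Channel>
  </Channels>
</NetworkService>
```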
OSPL-13698 |
When the node causing a master conflict is disconnected before the conflict is resolved, the master conflict may remain unresolved. When using legacy master selection and a master conflict is detected because another node has selected a different master, the current master is directly set to pending. When the node that caused the master conflict is disconnected before the conflict is resolved, the master conflict could be dropped even though it still exists; because the master is set to pending, no new master conflict is raised. Solution: A master conflict is always resolved, independent of whether the node that caused it has been disconnected and removed from the durability administration. |
OSPL-13705 / 00020691 |
Provide option to the RT networking service to optimize durability traffic by using point-to-point communication. For the RT networking service, the durability service is just another application. Protocol messages sent by the durability service will therefore be sent to all the nodes in the system. The protocol messages sent by the durability service are addressed either to all, to a subset, or to only one fellow durability service. To limit the networking load caused by the durability service, it would be beneficial if the networking service had some knowledge of the durability protocol and sent durability messages that are addressed to a single fellow point-to-point. This requires that the capability to send messages point-to-point is added to the RT networking service. Solution: Support for point-to-point communication for durability messages addressed to one fellow has been added. |
OSPL-13748 / 00020708 |
The RT networking service can run out of buffers when the receive socket is overloaded. To limit the chance of packet loss occurring in the receive socket, the networking receive thread tries to read as many packets as possible from the receive socket before processing these packets further. However, when the receive socket remains full, the number of received packets waiting to be processed keeps increasing, which may cause the networking service to run out of buffers. Solution: When reading packets from the receive socket, it is checked whether the number of packets waiting to be processed exceeds a threshold. When the threshold is reached, the networking receive thread will first process some waiting packets before attending to the receive socket again. |
OSPL-13753 / 00020714 |
When installing a Visual Studio 2005 version silently a popup window appears and stops the installation. The installations now ensure Visual Studio redistributables will not force a reboot of Windows before the main installer has completed, by using an optional parameter when running the redistributable. This created a problem for Visual Studio 2005 versions, as this parameter is illegal there and an error message is produced. Solution: An additional page has been included in the installer to allow users to not install the Visual Studio redistributable. This option can also be used when installing silently and allows a customer to skip the redistributable which creates the error condition. |
OSPL-13756 |
Provide the option to have the RT networking service perform the distribution of the builtin topics. When using the RT networking service, the durability service is responsible for alignment of the builtin topics, which is not the case when the DDSI service is used. In a large system the number of builtin topics can become very large. When the networking service is made responsible for aligning the builtin topics, only a node's own builtin topics have to be aligned when two nodes detect each other. Especially when a disconnect/reconnect occurs, this reduces the number of builtin topics that have to be aligned. Solution: The ability to align the builtin topics by RT networking has been added, and a configuration parameter ManageBuiltinTopics has been added by which this ability can be enabled. Note: to maintain similarity with the DDSI service this applies to DCPSParticipant, DCPSPublication, DCPSSubscription, DCPSTopic and the CM-related builtin topics. |
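A sketch of enabling this behaviour; whether ManageBuiltinTopics is an element or an attribute, and its exact location under the networking service, are assumptions not stated in the entry:

```xml
<NetworkService name="networking">
  <!-- assumed placement: let RT networking align the builtin topics
       instead of the durability service -->
  <ManageBuiltinTopics>true</ManageBuiltinTopics>
</NetworkService>
```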
OSPL-13771 / 00020719 |
Python API: 'Out Of Resources' exceptions when using conditions and shared memory. A memory leak was introduced in 6.10.4p1: in the Python class Condition, dealloc was removed, resulting in improper cleanup. This change was introduced as a fix for OSPL-13503 ("Cleanup error: dds_condition_delete: Bad parameter 'Not a proper condition'"). Solution: The change to remove the Condition dealloc was for a minor logging issue. This OSPL-13503 change was rolled back in order to fix the more serious Out of Resources exceptions. With this rollback, extra error messages may be logged. The memory leak for Condition is fixed. |
OSPL-13773 |
The durability service may send an alignment request to a not yet confirmed master. When during master selection a master is proposed but not yet confirmed, and in parallel the need to align the data of a topic/partition is triggered, it may occur that an alignment request is sent to this not yet confirmed master, which may not become the actually selected master. Solution: During initial alignment, delay requesting alignment of data until master selection has finished and the master is confirmed. |
OSPL-13781 |
Allow setting the master selection protocol on each durability namespace independently. Either legacy master selection or master selection based on master priorities can be configured for the durability namespaces. Previously, when master selection based on master priorities was used, it had to be configured on all the namespaces. However, it should be allowed to configure the master selection protocol for each namespace independently. Solution: The global setting of the master selection protocol is removed; the master selection protocol configured for each namespace is applied when selecting a master for that namespace. |
OSPL-13784 / 00020725 |
In the RT networking service, the synchronization on the first expected fragment from a newly detected fellow could fail. When the networking service receives a first packet from another node, it has to determine the first sequence number that is used by both the sending node and the receiving node as the starting sequence number of the reliable communication. This first sequence number is determined either from the offset present in the first received packet, or from the expected packet number indicated by the sync message when SyncMessageExchange has been enabled. The sender will then resend the packets from that agreed sequence number up to the sequence number of the packet already received. In this case packets with lower sequence numbers than the sequence number of the first received packet should be accepted. However, this may fail when the first sync message is lost, which may cause packets to be rejected by the receiver although they were already acknowledged. In that case the receiver will never receive the expected packet. Solution: When waiting for the sync message which sets the expected sequence number, packets received with a sequence number lower than that of the first received packet are accepted and placed on the out-of-order list. |
OSPL-13791 / 00020706 |
Potential memory leak in Java5 DataReader. The Java5 DataReader offers two different modes of reading data: it either returns an Iterator holding all requested samples, or you preallocate an ArrayList and pass that as input parameter to your read/take call. The latter is more efficient if you want to benefit from recycling samples allocated in the previous invocation of your read/take calls. For this purpose, the DataReader keeps track of a Map containing, for each previously passed ArrayList, all the relevant recyclable intermediate objects. However, if you keep feeding new ArrayList objects to each subsequent read/take call, the Map will grow indefinitely and leak away all your previous data. Although invoking the preallocated read/take calls this way is against their intended usage and design, some examples were doing exactly that. Solution: The examples have been modified not to feed new ArrayList objects to every subsequent read/take call. Also, the Map that keeps track of all previous ArrayLists and their corresponding intermediate objects has been replaced with a one-place buffer. That means you can still benefit from recycling intermediate data if you use the same ArrayList over and over, but anything related to any prior ArrayList you passed to the read/take calls will be garbage collected. |
OSPL-13795 / 00020698 |
Issues during termination of spliced when configuration specifies thread attributes for the heartbeat-manager. When the configuration file specifies attributes such as stack size or scheduling class for the heartbeat-manager in spliced (//OpenSplice/Domain/Daemon/Heartbeat), termination fails and triggers an error report "Failed to join thread (null):0x0 (os_resultSuccess)". Solution: The code was changed to cover a specific path where, after stopping the thread, an invalid return code was propagated, leading to failed termination. |
OSPL-13797 |
Unnecessary alignment may occur when a node with a namespace with aligner=false (temporarily) chooses a different master for this namespace. This unnecessarily increases network load. When a node with aligner=false for a namespace enters a network, this node starts looking for an aligner. If there are potentially multiple aligners but not all of them have been discovered yet, it could happen that this node chooses a different master for the namespace than somebody else. When the nodes that chose a different master for the namespace detect each other, a master conflict is generated. Resolving this master conflict leads to alignment. Although functionally there is nothing wrong, the unfortunate situation in this scenario is that the alignment for nodes with aligner=false is not necessary, because by definition of aligner=false such a node will not provide any alignment data to the master (whichever one is chosen). Still the master bumps its state, and causes all slaves to align from the master again. These alignments are superfluous. Solution: The situation where a node with aligner=false has (temporarily) chosen a different master is no longer considered a valid reason to start alignment. |
OSPL-13812 |
Trying to unregister a non-existent instance leads to an entry in ospl-error. This is incorrect. Trying to unregister a non-existent instance is a legitimate application action that should return PRECONDITION_NOT_MET. However, as a side effect an error message would also appear in ospl-error.log. Solution: The error message is no longer generated when a non-existent instance gets unregistered. |
OSPL-13844 |
Spliced will crash during shutdown if builtin topics have been disabled. There is a bug in the spliced that causes it to crash during shutdown when you configured OpenSplice not to communicate the builtin topics. This was caused by spliced forgetting to set the Writers for the builtin topics to NULL in that case, which during shutdown would result in the spliced attempting to release dangling random pointers. Solution: The writers for the builtin topics are now properly set to NULL when you disabled the builtin topics, and therefore spliced will not attempt to clean them up during shutdown. |
OSPL-13868 |
Configuration files for the NetworkPartitions example were incorrect. The example configuration files included a ddsi2 service and not ddsi2e, so the extra values are not visible in the configuration tool and would not be used by OpenSplice. Additionally, a number of the elements were incorrectly cased. Solution: Updated example files have been included. |
OSPL-13888 |
The durability service leaks memory when handling a received namespace message. When a namespace message is received from a fellow and that message is a duplicate of an earlier received namespace message, the allocated namespace object leaks. Solution: The duplicate namespace is freed. |
OSPL-13892 |
Potential backlog in the processing of builtin topics by spliced. The spliced is responsible for processing incoming builtin topic samples. This processing is needed to, for example, modify the local notion of the liveliness of remote entities and the instances they have written. Having the wrong notion of the liveliness of a remote entity could result in instances being marked ALIVE while they should have been marked NOT_ALIVE, or vice versa. Also, failure to notice the deletion of a remote entity could result in extended waiting times in case of, for example, a synchronous write, where a writer is still waiting for acknowledgments from a Reader that has already left the system. Due to a bug in the spliced code, the spliced could under certain conditions postpone processing of builtin topics for potentially long time intervals, resulting in incorrect liveliness representations during this interval, which in turn might cause extended waiting times in case of a synchronous write call. Solution: The spliced no longer postpones the processing of builtin topics, keeping the representation of the liveliness of entities and instances up to date, and avoiding unnecessary waiting times in the synchronous write call for readers that have already been deleted. |
OSPL-13923 |
MATLAB Query and Reader failure with waitsetTimeout(). The MATLAB Vortex.Query class would throw on calls to take() or read() if a non-zero value had previously been provided to waitsetTimeout(). BAD_PARAMETER messages would be written to ospl-error.log. In a similar situation, a Vortex.Reader class instance would appear to succeed, but ospl-error.log would still contain BAD_PARAMETER messages, and a DDS entity would be leaked with each call to read() or take(). Solution: The problems have been fixed. Uninstall the currently installed Vortex_DDS_MATLAB_API toolbox and install the new toolbox distributed with this release. (The toolbox is located under tools/matlab in the OpenSplice installation directory.) |
OSPL-13929 / 00020814 |
Alignment of DCPSPublication may cause instances that were explicitly unregistered and disposed not to be purged, leaking shared memory. When a disconnect of a node is detected, the instances written by writers on that disconnected node are unregistered. When the same node reconnects, alignment of DCPSPublication indicates which writers are still alive. These DCPSPublications are then used to update the liveliness of the corresponding instances. However, explicitly unregistered instances were also updated, which caused them to be removed from the purge list, resulting in a memory leak. Solution: When handling the re-alignment of a DCPSPublication, the corresponding instances that were explicitly unregistered are ignored. |
OSPL-13931 |
Potential alignment issue for unions in generic copy routines for C The generic copy routines for the C API may potentially misalign union attributes, causing the fields following the union to contain bogus values. Solution: The algorithm to determine proper alignment for unions has been corrected. |
OSPL-13937 |
Enable or disable tracing for Dlite. In some situations users want to disable tracing in production environments, and enable tracing in testing environments. So far, there has not been an easy way other than commenting out the tracing section in the configuration, which is cumbersome. Solution: An attribute //OpenSplice/Dlite/Tracing[@enabled] is added that can be used to enable/disable tracing for Dlite. |
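A minimal sketch of the new attribute; the 'name' attribute and the OutputFile child element are illustrative:

```xml
<OpenSplice>
  <Dlite name="dlite">
    <!-- enabled="false" switches tracing off without removing the section -->
    <Tracing enabled="false">
      <OutputFile>dlite.log</OutputFile>
    </Tracing>
  </Dlite>
</OpenSplice>
```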
OSPL-13803 |
Possible crash at termination of NodeJS with DDS Security. The DDS Security implementation relies on a certain termination path to clean up all its resources, part of which depends on an exit handler. This exit handler does not reliably run at the same moment, e.g. before or after certain threads are joined, depending on the context, such as a single-process deployment running in NodeJS. Solution: The cleanup was changed to work regardless of the exact moment at which the exit handler is executed. |
OSPL-13799 / 00020745 |
Generate a logging message for dispose_all_event The invocation of the dispose_all() function on a topic is an important event, that should appear in the ospl-info.log file. Solution: A message is written into the ospl-info.log by the node that invokes the dispose_all() function. Note: although all other nodes respond by also disposing their corresponding topic data, they don't mention this event in their ospl-info.log. |
OSPL-13870 |
Add a parameter to RT networking to allow the independent setting of the acknowledgement interval. A reliable channel uses acknowledgement messages to notify the sender that a packet has been successfully received. To limit the rate at which acknowledgements are sent, the acknowledgements are accumulated during the acknowledgement interval. Previously the acknowledgement interval was set to the configured resolution interval of the channel. However, it can be useful to have an independent parameter which specifies the acknowledgement interval. Solution: The AckResolution parameter has been added to the RT networking configuration. When set, it determines the interval at which acknowledgements are sent. When not set, the acknowledgement interval is set to the resolution of the reliable channel. |
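A channel sketch; the placement of AckResolution next to the channel Resolution and the units (assumed to match the Resolution units) are assumptions, only the parameter name comes from this change:

```xml
<Channel name="reliable" reliable="true">
  <!-- assumed: send accumulated acknowledgements every 50 time units,
       independent of the channel Resolution below -->
  <AckResolution>50</AckResolution>
  <Resolution>10</Resolution>
</Channel>
```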
OSPL-13871 |
Add a configuration parameter to RT networking to disable the sync exchange at initial detection of a node. When the SyncMessageExchange is disabled, reliable communication with another node is started when receiving the first acknowledgement from that node. When the SyncMessageExchange is enabled, reliable communication will also start when receiving a first message from a node. The sync message communicates the sequence number from which reliable communication is provided. However, this may cause a very high backlog of packets that have to be resent to the newly detected node, especially when initial latencies are large. Therefore an option should be provided to enable the SyncMessageExchange only on receiving the first acknowledgement, which reduces this initial backlog of resends. Solution: A mode attribute is added to the SyncMessageExchange parameter which indicates whether the reliable synchronization should occur on the initially received packet or on the first received acknowledgement. |
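A sketch of the new mode attribute; the attribute value name used here is hypothetical, as the entry only states that a mode attribute exists on SyncMessageExchange:

```xml
<!-- hypothetical value: synchronize only on the first received
     acknowledgement to avoid a large initial resend backlog -->
<SyncMessageExchange enabled="true" mode="FirstAck"/>
```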
OSPL-13875 |
Add an option to RT networking to disable the initial sequence number offset. To establish reliable communication between a sender and a receiver, both have to agree on the initial packet sequence number from which reliable communication is established. This sequence number is based on the first sequence number that is acknowledged minus a small offset which is included in each packet sent; the initial sequence number is then the first acknowledged sequence number minus the offset. This offset determines the number of packets that have to be resent immediately. To reduce this initial backlog, a configuration parameter was needed to disable this offset. Solution: The configuration parameter OffsetEnabled has been added to allow disabling the offset calculation. |
OSPL-13939 / 00020822 |
Looser type-checks for the Python API It is no longer required to use Python built-ins as long as the cast is defined. For string types, only support for encoding and length determination is needed. For sequences and arrays, iteration and length determination are the only requirements. Solution: Loosened type requirements on integers, bools, strings and floats. |
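The loosened checks described above (OSPL-13939) amount to standard Python duck typing: a value is accepted wherever the required cast or protocol is defined. The classes below are hypothetical illustrations, not part of the Python DCPS API — a sketch of the kinds of non-built-in types that now pass the checks.

```python
class Meters:
    """Not a built-in int, but the cast int(Meters(...)) is defined."""
    def __init__(self, value):
        self.value = value
    def __int__(self):
        return int(self.value)

class Samples:
    """Not a built-in list, but supports iteration and len()."""
    def __init__(self, items):
        self.items = list(items)
    def __iter__(self):
        return iter(self.items)
    def __len__(self):
        return len(self.items)

# With the loosened checks, types like these can stand in for
# integers and sequences, because the required casts are defined.
assert int(Meters(42)) == 42
assert len(Samples([1, 2, 3])) == 3
assert list(Samples([1, 2, 3])) == [1, 2, 3]
```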
Report ID. | Description |
---|---|
OSPL-13694 / 00020679 |
Incorrect behavior of the members of the 'dds_sample_info' bus of the Simulink Vortex DDS Reader block. The 'dds_sample_info_t' array on the Vortex DDS Reader block for Simulink was not initialized properly. Therefore, in the code generated from a Simulink model, the members of the 'dds_sample_info' bus were incorrectly assigned a value when there was no data available. Solution: The bug is now fixed. The 'dds_sample_info_t' array of the Vortex DDS Reader block is now initialized correctly. |
OSPL-13707 |
The function wait_for_historical_data may fail for a reliable transient-local topic. When ddsi has received all the historical data from a transient_local writer, it marks the associated dds readers complete, which signals wait_for_historical_data. In the particular situation where several readers for the same transient_local topic are created and the transient_local writer sends a GAP message which makes the history of one of these readers complete, it may occur that the corresponding dds reader is not notified that it has become complete, causing the wait_for_historical_data function to time out. When handling the GAP message, the sequence number contained in the GAP message was used to evaluate whether all historical data has been received; however, it should use the next expected sequence number associated with the reader history. Solution: When a ddsi transient_local reader receives a GAP message and the history cache associated with that reader is updated, the next expected sequence number is used to determine whether all the historical data has been received. |
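The completeness check fixed in OSPL-13707 above can be modelled with a small sketch (an illustrative model with assumed names, not the DDSI implementation): a GAP advances the reader's next expected sequence number, and it is that number — not the GAP message's own sequence number — that must be compared against the writer's history.

```python
def apply_gap(next_expected, gap_start, gap_end):
    # A GAP announces that [gap_start, gap_end] carries no data for this
    # reader; if it covers the next expected number, advance past it.
    if gap_start <= next_expected <= gap_end:
        return gap_end + 1
    return next_expected

LAST_AVAILABLE = 10  # highest sequence number in the writer's history

# The reader already has samples 1..7; the GAP [8, 10] completes it.
next_expected = apply_gap(8, 8, 10)
assert next_expected == 11

# Fixed check: compare the reader's own next expected sequence number.
assert next_expected > LAST_AVAILABLE   # complete -> notify waiters

# The buggy check compared the GAP's end sequence number itself, which
# is never beyond the writer history, so completion was not signalled:
assert not (10 > LAST_AVAILABLE)
```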
Report ID. | Description |
---|---|
OSPL-13503 / 00020504 |
Python API cleanup error on condition delete For Python API applications, error reports saying "Bad parameter Not a proper condition" in the context of "dds_condition_delete" were getting logged in the ospl-error.log file during the termination of the applications. Solution: The bug is fixed and the error will not occur anymore. |
OSPL-13672 / 00020656 |
RnR unable to record all historical data A regression issue caused a problem in the RnR service when recording historical transient or persistent data. The issue would result in no historical data at all, or only an incomplete set, getting recorded. Solution: The bug was fixed and a regression test-case extended to catch the issue if it occurs again in the same or similar circumstances. |
OSPL-13259 / 00020193 |
A race condition in the waitset implementation may cause an OpenSplice daemon to crash. In the implementation of the waitset there is a condition in which the waitset data structure is not properly protected. This may cause the waitset trigger and the waitset wait to access the waitset data structure concurrently, causing the crash. Solution: Access to the waitset data structure is protected in all code paths to make the waitset thread-safe. |
OSPL-13333 / 00020275 |
When running out of resources the networking service may crash when using compression. When compression is enabled, the networking service compresses several messages into one larger compressed message. A received compressed message is decompressed and placed in a buffer before deserializing the contained messages into shared memory. When an out-of-resources condition occurred during deserialization of a message because not enough shared memory was available anymore, this error condition was not properly handled, causing an incorrect memory access. If deserialization fails, the pointer used to scroll over the messages in the decompression buffer was not correctly set to the next message in the buffer. Solution: When the deserialization of a received message fails, the pointer used to scroll over the messages contained in the decompression buffer is correctly set to the next message. Furthermore, when the deserialization of the message fails because of an out-of-resources condition, a fatal error is raised which terminates the networking service. |
OSPL-13504 / 00020506 |
Segfault on exit when using Python API When using the OpenSplice Python API, in some cases a segmentation fault could occur during the termination of an application. It happened when the Python API was trying to free the memory allocated for a condition. Solution: The problem is now fixed. |
OSPL-13549 / 00020536 |
Memory leak in queried take There exist several memory leaks at places where original terms in a query are replaced with their disjuncted counterparts. In those places, the original term then leaks away. Solution: Modifications have been made to properly clean up these original terms. |
OSPL-13561 / 00020546 |
When SyncMessageExchange is enabled in the RT networking service, it may occur that reliable communication does not start correctly. When the send part of a reliable networking channel detects a remote node, it has to put the packets that the remote node is expecting on a pending resend list, which is used to resend packets that are not acknowledged by the other node. However, the remote node may already have received and acknowledged a packet with an earlier sequence number. When the SyncMessageExchange option is enabled, it may occur that the pending resend list is already populated with packets to resend, but when the first acknowledgement indicates an earlier sequence number, the packets the remote node is expecting may not be put on the resend list, causing the other node to wait for the missing packets. Solution: When a received acknowledgement contains a sequence number earlier than the first sequence number contained in the resend list for that node, the resend list is populated with the missing packets which are still available. |
OSPL-13574 / 00020557 |
Simulink Vortex DDS Block Set did not support "Rapid Accelerator" mode simulation. In previous Vortex OpenSplice releases, the Simulink Vortex DDS Block Set did not support "Rapid Accelerator" mode simulation. Solution: The Simulink Vortex DDS Block Set now supports "Rapid Accelerator" mode simulation. |
OSPL-13579 / 00020534 |
Python bug fix for the serialization problem for some structs when using dynamically generated Python classes. Previously, in the Python DDS API, the packing format for some structs was generated incorrectly when using dynamically generated Python classes. This problem did not affect Python classes generated by IDLPP. Solution: The bug is now fixed. |
OSPL-13598 |
In the durability service, change and add trace messages at level fine. To improve analysis of durability trace files when the log level is set to fine, a number of trace messages which were logged at a higher level had to be enabled at level fine. Solution: A number of trace messages have been enabled at level fine. Furthermore, some extra trace messages have been added to improve the analysis of the durability trace file. |
OSPL-13610 |
Matlab read, take, query-based read and query-based take functions return identical values from the Reader's cache. While reading N samples (N > 1) from the Reader's cache, the Matlab functions read, take, query-based read and query-based take were processing the data vector incorrectly and returning only the first entry from the Reader's cache. Therefore, the output samples were identical after reading samples from the Reader's cache. Solution: The Matlab read, take, query-based read and query-based take functions have been updated to return all entries from the Reader's cache. |
OSPL-13614 |
The durability service fails to detect a master conflict when this occurs during the initial phase. When the durability service starts, it first determines a master for each configured namespace and then identifies the topic-partitions that are locally known before it starts the initial alignment. When a master conflict occurred during this phase, e.g. another node had selected another master, the service failed to trigger a master conflict which would redo the master selection to obtain a single master in the system for each namespace. Solution: Enable the detection of a master conflict after having selected a master. |
OSPL-13621 / 00020601 |
A reconnect may cause the durability service to incorrectly align implicitly disposed samples. When a node detects that another node has become disconnected, the corresponding instances written by writers on the disconnected node are implicitly unregistered. When the associated writer QoS has autodispose_unregister_instances enabled, the implicit unregister also causes an implicit dispose of these instances. Note that these implicitly generated messages are related to the local discovery view of the node and should not be aligned between systems. These implicit messages will be purged from the transient store after the service_cleanup_delay expires. However, when the durability service is performing an alignment, the purging of the transient store is suppressed until the durability service declares the transient store complete again. When a reconnect occurs, the durability service may perform an alignment which incorrectly contains these implicit dispose messages. Solution: When collecting alignment data from the transient store, implicit messages are filtered out. |
OSPL-13622 / 00020602 |
Tuner: In certain use cases, adding a query to a reader has no effect on incoming samples. A reader can be created with an optional query. A user would expect that if a query is defined, the query is used when reading data. However, in certain cases, the query is not applied to incoming samples. Solution: Workaround: A read or take using a query is supported ONLY IF you access the reader query from the Participant view. In the Participant view, you can see the reader and, underneath it, the query for the reader. The query must be used to read data instead of the parent reader, in order for the query to be applied. Note: This feature in the participant view was broken in 2016 with the implementation of type evolutions (OSPL-7242). It has been fixed as part of this ticket. |
OSPL-13632 |
Isocpp2 backend of idlpp represents true/false literals for booleans in capitals When using a boolean literal in your IDL to represent the value of a constant, the ISOCPP2 language backend would illegally represent these literals in uppercase (like it is done for classic C/C++, where an IDL boolean is actually mapped to an unsigned char). For ISOCPP2, however, an IDL boolean is mapped to the C++ type bool, which can only hold the C++ keywords "true" and "false". Solution: The boolean IDL literals are now represented in lowercase for the ISOCPP2 target. |
OSPL-13635 |
When the durability service uses legacy master selection, a master conflict may not be resolved when a fellow disconnects before the conflict is resolved. In case the durability service uses legacy master selection and a master conflict is detected, the current master of the namespace is directly marked pending to prevent a duplicate master conflict from being raised. However, when the fellow that caused the master conflict disconnects before the master conflict is resolved, the master conflict may be discarded as not relevant anymore. This may cause the master conflict to still exist with other fellows which were detected but ignored because the master was directly set to pending. Solution: When detecting a master conflict in a namespace, the master of the namespace is no longer directly set to pending. The master is set to pending at the moment new master selection is performed. |
OSPL-13641 / 00020630 |
Potential mis-alignment in C# union may cause data corruption In some cases, the database representation of a union in C# might suffer from mis-alignment, caused by an incorrect Packing instruction that was generated by idlpp as part of this C# database representation. This mis-alignment might cause data corruption for OpenSplice services and other applications. Solution: The incorrect Packing instruction is removed. |
OSPL-13668 / 00020651 |
Python DCPS API: When explicitly assigning an empty list to a sequence in Python, reading the message fails. As a result, to clear a sequence, the whole message needed to be cleared and refilled with data. Solution: Fixed. A check is now made in _deserialize_seq to ensure that no exception is thrown when reading a message with an empty sequence. |
OSPL-13669 |
Low priority durability master selected when using master priority selection. When the durability service is configured to use priority-based master selection, the instance with the highest priority should become the master. However, depending on the order in which the priorities of the fellows were evaluated, it could occur that an instance with a lower priority was selected as master, which could cause master selection to be inconsistent between the durability instances. Solution: When determining the fellow with the highest priority while walking over the known fellows, a temporary reference to the candidate fellow with the highest priority is kept. |
OSPL-13218 |
Better support for multi-user Windows environments Some features of OpenSplice process-management are not supported on Windows when multiple user-accounts are involved in running OpenSplice services and applications in shared-memory mode. Solution: Instead of refusing applications to connect to shared-memory, the process-management features can now be disabled through configuration options. For more information see Deployment Guide, section 13.2.11.3 shmMonitor, regarding registerCallback and registerProcess settings. |
OSPL-13631 / 00020608 |
Unnecessary information was removed from the service log. The service log file size was increasing quickly. Solution: Unnecessary topic type (MetaData) information has been removed from the New-BuiltinTopic log record to filter unnecessary data and reduce log size. |
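The waitset fix (OSPL-13259 above) amounts to funnelling every access to the shared waitset state through a single lock/condition pair. A minimal Python model of that design follows — an illustrative sketch, not the OpenSplice implementation:

```python
import threading

class Waitset:
    def __init__(self):
        self._cv = threading.Condition()   # guards all shared state
        self._triggered = False

    def trigger(self):
        with self._cv:                     # never touch state unlocked
            self._triggered = True
            self._cv.notify_all()

    def wait(self, timeout=None):
        with self._cv:                     # same lock on the wait path
            ok = self._cv.wait_for(lambda: self._triggered, timeout)
            self._triggered = False
            return ok

ws = Waitset()
t = threading.Thread(target=ws.trigger)
t.start()
assert ws.wait(timeout=5.0)                # True once triggered
t.join()
```

Because trigger and wait acquire the same lock, the two threads can never observe the shared state mid-update, which is the concurrent-access hazard the fix removes.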
Report ID. | Description |
---|---|
OSPL-13488 / 00020481 |
The IDL feature of using file hierarchy supported by 'idlpp -I' option was not implemented in Simulink. The IDL feature of using file hierarchy supported by 'idlpp -I' option was not implemented in the IDL import function of the Simulink Vortex DDS block plugin. Solution: The 'idlpp -I' feature is now implemented in Simulink. A new function 'idlImportSlWithIncludePath' has been added to the Simulink Vortex DDS block plugin to support the IDL feature of using file hierarchy. In this function, the positional argument 'INCLUDEDIRPATHS' can be used to pass a cell array with the values of the include paths for IDL files. |
OSPL-13501 / 00020505 |
Network bridge does not operate correctly when bridging native RT networking services only. The networking bridge subscribes to the builtin topics to learn the topology of the network. It waits for a subscription match status on the corresponding readers, to assure that the ddsi service which contains the writers for the builtin topics is available. However, when only native RT networking services are present this does not work, because the native RT networking services do not have writers for the builtin topics. Solution: When only native RT networking services are being bridged, the check on the subscription match status of the builtin topic readers is not performed. |
OSPL-13377 / 00020342 |
An exception is thrown when handling events such as requested_deadline_missed While handling events such as requested_deadline_missed, if the handle of the entity that caused the event to be triggered becomes invalid, an exception is thrown. Solution: The handling of events has been updated to return an error in case the handle of the entity that triggered the event becomes invalid, instead of throwing an exception. |
OSPL-13400 / 00020352 |
Incompatibility issue between the Python API and the C++ API when using the KEEP_ALL history QoS policy The Python API (C99 API) and the C++ API use different values for depth when using the KEEP_ALL QoS policy; therefore an error occurred when comparing the QoS policies used in both APIs. Solution: The depth values are no longer compared when using the KEEP_ALL QoS policy, in all APIs. |
OSPL-13470 |
Potential memory leak in deallocation of sequence buffer for C/C99 The C/C99 representation of your IDL model suffered from a memory leak in the following case: 1. You have an attribute that is a typedef to a sequence. 2. The element type of this sequence contains an indirection (i.e. directly or indirectly holds a string or another sequence). The problem was that the deallocator for the sequence buffer, as generated by idlpp, was missing a deallocator for the indirection(s) in the individual elements, which therefore leaked away. Solution: A proper deallocator function for each individual sequence element is now correctly generated and invoked by the deallocator of the sequence. |
OSPL-13484 / 00020479 |
Durability master selection may fail when a potential namespace master is started and delayed alignment is used. When delayed alignment is enabled for a namespace, a node has selected a master for that namespace, and another node which is also a potential master for that namespace is started or restarted, this could trigger a new master selection for that namespace even if the original master has already published data. Timing issues could cause the newly triggered master selection to fail to find a new master. This may occur when the newly started node has not reached the correct state when master selection is performed. Solution: The durability service now updates the quality of a namespace when receiving the namespace information from the master node. This prevents master selection from being restarted when delayed alignment is used and data has already been published. Furthermore, delayed alignment only triggers a new master selection when the potential new master has reached the correct state. |
OSPL-13493 / 00020486 |
Potential data loss during alignment when using the REPLACE merge policy. A bug was discovered in an algorithm responsible for the conversion of wall-clock time to elapsed time in the durability service, which could result in the disappearance of data on some nodes that are being aligned using the REPLACE merge policy. Basically, the conversion error might represent a time that was x seconds in the past as being x seconds in the future. Given an arbitrary time t, if during the window [t-x, t+x] data is received through the live path, this live data might be purged as part of the old data set that is to be replaced by the data set received as part of the merge. Solution: The conversion error has been corrected. |
OSPL-13500 / 00020502 |
When the RT networking service is restarted, communication may not be restored. When the RT networking service is configured with the restart failure action and the networking service is restarted because of the occurrence of an exception, it may occur that communication is not restored. This can happen when the fellow networking services do not detect the restart because it happens within the configured death detection time. Solution: When the networking service is restarted, the restart is delayed for a short duration which is longer than the configured death detection time, to let the fellow networking services detect the restart. |
OSPL-13505 |
Stalling alignment due to not answering a d_nameSpacesRequest When durability services discover each other they need to exchange capabilities and namespaces. The rule of the game is that a durability service can only send namespaces to another durability service when a capability of the remote durability service has been discovered, and when a request for namespaces has been received. Different threads are involved in processing the capability and the request for namespaces. There was a race condition between these threads such that, when a request for namespaces was received just prior to the capability, it could happen that no namespaces were sent. This led to stalled alignment. Solution: The race condition has been removed by properly locking the code section that was causing it. |
OSPL-13506 |
AlwaysUsePeeraddrForUnicast DDSI option conflicts with Vortex Link Fault Tolerance When the AlwaysUsePeeraddrForUnicast option is enabled, it conflicts with the Vortex Link Fault Tolerance feature. Solution: The AlwaysUsePeeraddrForUnicast implementation now only controls SPDP messages; the control for SEDP messages has been reverted. |
OSPL-13512 |
CopyIn/CopyOut functions generated for IDL typedefs are not properly exported The copyIn/copyOut functions for typedefs as generated by idlpp for the C/C++ targets were missing the import/export macro needed to invoke those operations from the context of another library/executable. This may lead to issues on Windows platforms if one IDL file uses a typedef from another IDL file, whereas the code generated for the latter IDL file is not contained in the same library/executable as the first. Solution: CopyIn/CopyOut functions for typedefs are now properly exported. |
OSPL-13514 |
shmdump cannot handle dump files bigger than 4.2GB When using shmdump to dump the contents of your shared memory to a file, you run into a number of issues: * You must manually specify the size of your shared memory as a mandatory command-line parameter to shmdump, which is cumbersome and not needed since the size can already be obtained from other sources. * The variable holding the size is only 32-bit, and therefore overflows when the shared memory size is > 4.2GB. Solution: Specifying the shared memory size is no longer mandatory, although still possible. Also, the shared memory size is now stored in a variable type that corresponds to the platform width. |
OSPL-13516 / 00020513 |
Possible crash when creating shared subscribers or DataReaders simultaneously When two or more processes create a shared DataReader or subscriber, only the first succeeds and the others slave to it. In specific circumstances the slaves could return a partially initialized entity, resulting in undefined behaviour and/or a crash when the entity is used. Solution: The issue is resolved by adding a synchronization mechanism so a shared DataReader or subscriber is only returned when fully initialized. Note that shared DataReaders and subscribers are a proprietary DDS extension applicable in shared-memory deployment only, enabled by setting the Share QoS policy on a Subscriber and its DataReaders. |
OSPL-13530 / 00020525 |
The RT Networking service could run out of fragmentation buffers when the backlog of several nodes grows. The RT Networking service is normally configured with a maximum number of fragmentation buffers. Fragmentation buffers are used to store the received network packets. When the maximum number of fragmentation buffers is exceeded, a fatal error is raised and the Networking service terminates. A reliable channel maintains for each remote node a backlog of fragments which cannot be delivered because fragments are missing. When the backlog of a certain node exceeds the configured threshold, the backlog of that node is cleared and the node is temporarily disconnected to inform the system that messages of that node have been missed. However, it may occur that several nodes have a growing backlog and the maximum number of fragmentation buffers is exceeded before the backlog threshold of any one of these nodes is exceeded, causing the Networking service to terminate. Solution: When the number of used fragmentation buffers reaches the configured maximum, the remote node with the largest backlog is selected and that backlog is cleared to free fragmentation buffers. This node is marked temporarily disconnected to signal that reliable communication with that node has been disrupted, which will trigger the durability service to perform a possible realignment of the missing messages. |
OSPL-13580 / 00020360 |
Exception in Java FACE API caused by invalid unlock after incoming data callback When an exception occurs while handling incoming samples using the FACE callback mechanism, e.g. a DataReader becoming invalid because the domain was terminated, a lock was always released even if the exception was raised before the lock was taken. Solution: Exception handling was modified to properly lock/unlock in all circumstances. |
OSPL-13203 |
Visual Studio 2019 requires MATLAB version R2019b as a minimum Visual Studio 2019 is not supported by MATLAB versions prior to R2019b so previous versions of OpenSplice with VS2019 have not included MATLAB. Solution: Builds of OpenSplice with Visual Studio 2019 now include MATLAB and require version R2019b as a minimum. |
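The depth comparison skipped by OSPL-13400 above can be sketched as follows. The constants and helper below are hypothetical illustrations, not the actual API, but they show why depth must be ignored under KEEP_ALL, where its value is meaningless:

```python
KEEP_LAST, KEEP_ALL = 0, 1   # hypothetical history-kind constants

def history_qos_equal(kind_a, depth_a, kind_b, depth_b):
    """Compare two History QoS settings, skipping depth for KEEP_ALL."""
    if kind_a != kind_b:
        return False
    if kind_a == KEEP_ALL:
        return True              # depth is meaningless under KEEP_ALL
    return depth_a == depth_b

# The two APIs used different internal depth values for KEEP_ALL;
# skipping the comparison makes them compatible.
assert history_qos_equal(KEEP_ALL, -1, KEEP_ALL, 0)
assert not history_qos_equal(KEEP_LAST, 1, KEEP_LAST, 2)
assert history_qos_equal(KEEP_LAST, 5, KEEP_LAST, 5)
```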
Report ID. | Description |
---|---|
OSPL-11049 / 00018425 |
Unicast messages not received by the networking service when using multicast on Windows When the networking service is used on Windows and is configured to use multicast, the sockets used for sending were incorrectly bound to the same port number as the sockets used for receiving messages. This may cause unicast messages (ACKs and resends) not to be received by the networking service on Windows. Solution: The sockets used for sending are now bound to port number 0. By using port number 0 the socket is bound to a random ephemeral port. |
OSPL-13355 / 00020324 |
For the secure networking service the SyncMessageExchange option does not work correctly. The sync messages exchanged when the SyncMessageExchange option is set were not encrypted. The receiver tried to decrypt these messages, which caused them to be dropped. Because the sync message was never acknowledged, the sender would not purge the resend list, causing the networking service to run out of memory. Solution: When secure networking is used, the sync messages are encrypted. |
OSPL-13368 / 00020337 |
Simulink Model Validation errors with GOTO blocks and Subsystems When a Vortex DDS block entity port (e.g. domain participant or topic) was connected to either a GOTO block or a port on a subsystem, the DDS block would report an error during validation. Solution: Validation of entity port connections has been relaxed to allow connections to blocks such as GOTO and Subsystems. No changes are required to existing blocks or models. |
OSPL-13392 |
When the SyncMessageExchange is enabled it could occur that after a reconnect the communication does not restart. When the SyncMessageExchange is enabled in the RT networking service, a sync message is sent when a reconnect occurs. When the receiver receives the sync message, it starts waiting for a message with the sequence number communicated through the sync message sent by the sender. However, it could occur that an old acknowledge message was received at the sender, which incorrectly caused the sender to drop the message that the receiver was waiting for. Solution: After receiving a sync message which resets the communication with a sending node, the receiver no longer sends acknowledge messages for received messages older than the sequence number indicated by the sync message. Furthermore, at the sending side, the message acknowledgements and the acknowledgement of the sync message now go through the same queue to ensure in-order delivery. |
OSPL-13416 / 00020384 |
Closing sockets after SSL failure A TCP socket was left open after an SSL connection failure, such as when using an expired or revoked certificate. DDSI tries to connect to the TCP peer as long as the application/federation stays running, and so consumed all socket resources of the operating system. Solution: The TCP socket is closed after receiving an error during the SSL connection attempt. |
OSPL-13423 / 00020393 |
Merge policy "REPLACE" might not revive an aligned instance correctly Upon a reconnect to a durability master, the merge policy "REPLACE" will first dispose the current data-set, and then re-insert the aligned data set, meaning that all instances contained in the latter set that have not been explicitly disposed or unregistered by their owner should end up in the ALIVE state again. However, a bug in the instance state machine caused the aligned sample to be dropped if the user took the initial DISPOSE from the start of the "REPLACE" policy prior to the insertion of the aligned sample. Since the initial DISPOSE and the insertion of the aligned sample did not occur atomically, there was a slight chance that the user would consume the DISPOSE prior to the insertion of the aligned sample. Solution: The "REPLACE" merge policy has now been modified in such a way that the insertion of the initial DISPOSE message and the insertion of the aligned sample happen atomically. Because of that, the user is not able to consume the DISPOSE prior to the insertion of the aligned sample. |
OSPL-13426 |
Durability fails to become complete when the master becomes disconnected while resolving the initial conflict. When the durability service starts, it first has to determine a master for each of the namespaces to determine which of the present durability services is allowed to inject the persistent data. After selecting a master, all present durability services are asked for their topic/partition information. However, when the master disconnects and no topic/partition information has been received from the master, the durability service fails to become complete. Solution: When the selected master becomes disconnected during the initial phase, the master selection is restarted. |
OSPL-13427 / 00020396 |
Python API memory leak in the Waitset's wait function. In the wait function of the Waitset class, memory was allocated for _c_conditions to get all the triggered conditions attached to the waitset. It was never freed, causing a memory leak on every waitset wait. Solution: The memory leak has been fixed. |
OSPL-13474 / 00020421 |
The handling of the DCPSPublication and DCPSSubscription builtin topics causes a memory leak. The spliced daemon maintains a list of the samples of the DCPSPublication and DCPSSubscription builtin topics to perform cleanup actions when corresponding instances become not-alive. However, the reference count of these samples could incorrectly be increased when updating these lists, causing a memory leak. Solution: When an outdated sample is rejected on insertion, the reference count of the rejected sample is no longer incremented. |
OSPL-11116 |
DDSI ability to ignore advertised addresses and/or use the peer/source address A problem can occur when the advertised addresses of Link/Fog cause DDSI2 to disconnect and then try to reconnect to the wrong address. Solution: A new option TCP/AlwaysUsePeeraddrForUnicast has been added that can be used to ignore an advertised unicast address. |
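The port fix in OSPL-11049 above relies on standard socket semantics: binding to port 0 asks the operating system for an ephemeral port, so a send socket can no longer collide with the port used by the receive socket. A small stand-alone illustration:

```python
import socket

# Receive socket; port 0 here just lets the OS pick a free port,
# standing in for the configured receive port.
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(('127.0.0.1', 0))
recv_port = recv_sock.getsockname()[1]

# Send socket bound to port 0: the OS assigns a distinct ephemeral
# port, so it cannot clash with the receive socket's port.
send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.bind(('127.0.0.1', 0))
send_port = send_sock.getsockname()[1]

assert recv_port != 0 and send_port != 0
assert send_port != recv_port

send_sock.close()
recv_sock.close()
```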
Report ID. | Description |
---|---|
OSPL-13235 / 00020125 |
Modified type generation for FACE C++ binding in idlpp The C++ FACE binding is built on top of our isocpp2 DCPS binding. Therefore types generated from IDL followed the IDL to C++11 specification. This doesn't match the FACE specification, which expects the classic IDL to C++ (03) mapping. Solution: idlpp was modified to apply the classic C++ type mapping when generating code for FACE. Instead of accessing members using getters/setters as usual in isocpp2, members can now be accessed directly. Note that the FACE binding itself is still built on top of isocpp2; therefore both the include/dcps/C++/SACPP and isocpp2 directories must be added to the include path when compiling the generated code. |
OSPL-12629 |
In case alignment data is requested from multiple nodes, alignment may stall when a node leaves during alignment. To determine whether an alignment action has completed, the number of expected samples is calculated from the number of samples that are aligned from each node, compensated for possible duplicates (different nodes can send the "same" data, which should result in a single expected sample). When a node leaves before the alignment has completed, the number of expected samples must be compensated too. There was a flaw in the algorithm that compensates the number of expected samples when a node leaves. This could lead to an incorrect number of expected samples, potentially causing alignment to never finish. Solution: The algorithm to calculate the number of expected samples has been corrected. |
OSPL-13100 / 00019942 |
The networking service may crash when receiving a message for an unknown network partition. When the networking service receives a packet that does not relate to a network partition known by the node, it still tries to process the packet with an invalid partition reference, which may cause the networking service to crash. Solution: When a packet is received that does not relate to a known network partition, the packet is ignored and a warning is logged. |
OSPL-13183 / 00020091 |
Unintended instance revival during re-alignment due to premature purging of readerInstances. Consider the following scenario:
Solution: By not purging the reader instance immediately after it was disposed/unregistered and left empty by the take operation, but delaying the purge until after the injection of the snapshot, we can avoid instance I1 being revived, because the now conserved history record of the instance will indicate that S1 has already been consumed. The instance purge delay is derived from the sum of the following two parameters: the configured RetentionPeriod (in ms) of the Domain section in your config file, and the configured service_cleanup_delay of the topic's DurabilityServiceQosPolicy. This gives you the possibility to define a global minimum offset and a topic-specific addition. |
OSPL-13197 / 00019796 |
The networking synchronization option sometimes fails to synchronize to nodes after an asymmetrical reconnect. A synchronization option was added to the networking service configuration to provide better recovery after an asymmetrical reconnect. However, the synchronization could fail because an old ack message could incorrectly purge the resend queue. Solution: When the synchronization option is enabled, the networking service will not purge the resend queue for a particular node until it has received an acknowledgement of the synchronization message. |
OSPL-13211 / 00020101 |
The autopurge_dispose_all setting may cause the reader instance to be incorrectly purged when applying a REPLACE merge. When durability performs a REPLACE merge alignment and the reader has the autopurge_dispose_all QoS policy enabled, the reader instance is incorrectly purged and the sample injected as a result of the REPLACE alignment is rejected. A REPLACE merge action first disposes the reader instance before injecting the alignment sample. This dispose was misinterpreted as resulting from a dispose_all operation, causing the purge. Solution: The flags of the dispose now indicate that it is caused by a REPLACE merge action. The autopurge_dispose_all implementation checks these flags to determine whether the dispose resulted from a dispose_all operation or not. |
OSPL-13255 / 00020103 |
Java WaitSet timeout influenced by non-triggering QueryConditions. A QueryCondition can be attached to a WaitSet. Then it is possible to wait for a sample that is accepted by the QueryCondition. A timeout can be added to that WaitSet wait. When a sample is received that is not accepted by the QueryCondition, the internal WaitSet wait is called with the original timeout, causing the wait duration to increase. Solution: When a sample is received that is not accepted by the QueryCondition, the timeout is re-calculated before calling the internal WaitSet wait again. |
OSPL-13283 / 00020201 |
Potential memory leak in generic copy functions for SAC and C99. Copy routines are used to copy samples from your language representation into the shared memory database representation and vice versa. They are normally generated by idlpp and dedicated to the data types specified in your IDL file. However, for SAC and C99 it is also possible to use generic copy functions, which do not depend on compile-time availability of your datatypes but perform the copying based on run-time availability of your sample's metadata. This is useful for applications that have no compile-time knowledge of the datatypes they will work with, and that have to discover their topic's metadata at runtime. An example of the use of a generic copy function is the Python language binding, which is built on top of the C99 API, and which allows you to introspect an arbitrary Python sample (without having to pre-model it in IDL and then compile a dedicated copy function) and write it into DDS or read it out of DDS. The problem with this generic copy function was that it suffered from a serious memory leak for samples that contained a sequence of structs (with or without subsequent references). Solution: The memory leaks have been fixed. |
OSPL-13300 |
The CM* builtin topics are not logged when builtin topic logging is enabled. When processing the received CM* builtin topics the log function incorrectly discards these topics. Solution: The builtin topic logging function accepts all received builtin topics and logs them to the log file. |
OSPL-13314 / 00020254 |
Registering a typesupport using the generic type of the classic C API may cause a crash in a multi-domain context. The classic C API provides functions to use generic types. When a generic TypeDescriptor is registered with a DomainParticipant, information is generated to copy-in/out the corresponding topic data. This information contains a reference to the type descriptor, which is stored in the associated database. However, when multiple domains are used and the same TypeSupport is registered with different domains, the second domain would use the copy-in/out information generated for the first domain, causing a crash when writing a topic. Solution: When registering a generic TypeDescriptor with a domain, it is checked whether it was already registered with another domain, and if so the copy-in/out information is generated again for the current domain. |
OSPL-13325 / 00020264 |
C# backend of idlpp may generate invalid demarshaling code for IDL unions with no contained references. The C# backend for idlpp generated demarshaling instructions that cause C# compilation errors for an IDL union with no contained reference types (i.e. it contains no strings or sequences, neither directly nor indirectly) but with one or more branches that hold a struct or union. Solution: The C# backend has been modified to generate the correct demarshaling code. |
OSPL-13336 / 00020277 |
Networking bridge service does not forward topics. When using the networking bridge service, user topics did not get forwarded to the other network service when DDSI is used as the network service. Solution: The defect in the DDSI service is solved and user topics are forwarded again. |
OSPL-13338 / 00020274 |
The spliced daemon may crash when a reconnect of a remote system is detected. When the spliced daemon receives a heartbeat from a previously disconnected remote instance, it updates the DCPSPublication instances associated with that remote system. However, these DCPSPublication instances may contain invalid data, causing the crash. Note that the update of the DCPSPublication instances is not necessary, because they are already updated by the durability service (in case RTNetworking is used) and/or by the ddsi service. Solution: When a remote system heartbeat is received by the spliced daemon, it updates the information about the alive systems, which is used to determine whether data that is aligned through the durability service is alive or not. The update of the DCPSPublication is performed either by the alignment of the builtin topics through the durability service or by the ddsi service as a result of the discovery process. |
OSPL-13342 |
Simulink Reader Block did not produce valid Sample Info during simulations During simulations, the Simulink Reader block's 'info' port did not produce meaningful output. Code generated with Simulink Coder does not suffer from this problem. This problem was introduced with OpenSplice 6.8.3. Solution: The Reader block has been corrected to provide correct sample information on the 'info' port. |
OSPL-13351 |
When UseSyncExchange is enabled it may cause synchronization between the sender and receiver to fail. The UseSyncExchange option improves the synchronization on the first packet by both the receiver and the sender, to minimize unnecessary packet loss. The sender selects the first packet either based on the first acknowledgement from the receiver or from the first packet sent after detecting the receiver. However, the receiver may already have received an older packet for which the acknowledgement has not yet been received by the sender. The synchronization message sent to the receiver includes a lower bound for the sequence numbers that the receiver may accept and that are available for resending by the sender. This lower bound was set too restrictively, which caused packets the receiver was waiting for to be rejected. Solution: The lower bound of the synchronization message is now set to what is available for resending. |
OSPL-13352 |
An overload situation at the networking receive channel may cause acknowledgements not to be sent in time. To prevent packet loss in the receive socket, the networking service tries to read the receive sockets with priority. When a socket contains a high number of packets, reading all these packets may stall the sending of acknowledgements. This may cause other nodes to remove this node from the reliable protocol because acknowledgements are not received in time. Solution: The maximum number of bytes read from a socket in each iteration is limited, enabling the networking receive channel to handle the sending of acknowledgements and the delivery of received messages to the applications in time. |
OSPL-13354 / 00020323 |
Linker may detect missing symbols for idlpp-generated C/C99 code when using an anonymous sequence. When using an anonymous sequence, the context-dependent allocation functions were declared but their implementation was not generated, which could lead to linker errors. Solution: We now also generate the context-dependent allocation implementation, so that even people who like to use those in their applications (although this is not according to the IDL-C language mapping) will no longer run into linker errors due to missing symbols. |
OSPL-13360 |
Potentially stalling alignment when many nodes are discovered simultaneously. When durability services discover each other, the durability protocol starts by exchanging namespaces. This is done by sending a so-called nameSpacesRequest message from one durability service to another. The other durability service then responds by sending its namespaces. In situations where many nodes are discovered more or less at the same time and there are many namespaces to exchange, it is possible to end up in a state where one node is waiting for the namespaces of the other node, but the other node is asymmetrically disconnected and never receives the request for namespaces. Because exchanging namespaces is a precondition for alignment, failing this kickstart can cause alignment to fail. Solution: A mechanism has been implemented to resend a nameSpacesRequest when it is detected that a remote node has experienced an asymmetric disconnect. As an additional safety measure, a nameSpacesRequest is resent if it takes too long to be answered. The default value is 30.0 seconds, but it can be overridden by setting the environment variable OSPL_DURABILITY_MAX_PERIOD_WAIT_FELLOW_NS. When set to 0.0 the retry mechanism is disabled. |
OSPL-13361 / 00020179 |
Tester-created default readers unable to read historical data samples. The default volatile readers created by the Tester were unable to receive historical data samples written by transient durability writers. Solution: A new boolean property “scriptDefaultReaderWait” has been added to the “tester.properties” file. The default value of this property is “false”. Setting it to “true” makes the Tester default readers created by a script wait for historical data samples. This solution applies to scripts only and does not affect UI reader creation. |
OSPL-13366 / 00020336 |
Memory leak in listener events not yet processed when the application terminates. Listener events could leak shared memory in v_readerStatus event-data under specific circumstances where the application is terminated before queued events can be processed by the listener event thread. Solution: An internal change involving proper ref-counting of event-data ensures the relevant memory is freed in all circumstances. |
OSPL-13369 |
Durability may find no aligner for a namespace when using legacy master selection. With legacy master selection, the initial master selection could find no aligner when the namespaces from possible aligners had not yet been received during the initial master selection. After just two iterations over the received namespaces it would decide that it could not find an aligner for the namespace and continue without selecting one. Solution: The durability service now waits for an aligner to become available during the initial alignment phase. |
OSPL-13388 |
The spliced daemon may crash when receiving invalid heartbeat messages. When processing received heartbeat messages, the spliced daemon accesses the corresponding instance to determine the instance state of the received heartbeat sample. When the corresponding reader only contains invalid messages, the read operation may remove the instance. Solution: When reading heartbeat samples, the corresponding instance is also kept alive during the processing of the heartbeat sample. |
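The retry tuning knob introduced by OSPL-13360 above is an environment variable; a minimal sketch of overriding it before starting a deployment (the variable name and the 30.0-second default come from the entry above; the 60.0 value here is only an illustration):

```shell
# Resend a nameSpacesRequest if a fellow has not answered within 60 seconds.
# Default is 30.0; setting 0.0 disables the retry mechanism entirely.
export OSPL_DURABILITY_MAX_PERIOD_WAIT_FELLOW_NS=60.0

# ...then start the deployment as usual so the durability service inherits
# the variable from its environment, e.g.:
#   ospl start
```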
Report ID. | Description |
---|---|
OSPL-7334 |
When two nodes become disconnected and reconnect again, the DELETE merge policy deletes the data set even when nothing has changed. When nodes become disconnected and reconnect again, the configured merge policy is applied. If the merge policy is DELETE, the data set will be deleted. This is undesired when there is no difference between the data sets of the nodes involved. Solution: The DELETE merge policy is now only applied when there is a difference between the data sets of the nodes. NOTE: this change in behavior is currently only implemented when the equalityCheck is configured; without the equalityCheck the behavior is unchanged. |
OSPL-12768 / 00019538 |
Alignment may stall when a new local group is created while a merge request to acquire the data for the same group is ongoing. When a durability service learns about a partition/topic combination, it may have to acquire data for this group by sending a request for samples to its master. When at the same time a merge conflict with the fellow is being handled, this may also lead to the sending of a request for samples to the same fellow. Both paths are decoupled for performance reasons, so there is a window of opportunity that may lead to two identical requests for the same data to the same fellow. Since the requests are identical, only one of them is answered. The remaining one never gets answered, which may potentially stall conflict resolution. Solution: The requests are now distinguished so that they are never identical. |
OSPL-12862 / 00019625 |
Alignment may stall when a new local group is created while a merge request to acquire the data for the same group is being scheduled. When a durability service learns about a partition/topic combination, it may have to acquire data for this group by sending a request for samples to its master. When at the same time a merge conflict with the fellow is being handled, this may also lead to the sending of a request for samples to the same fellow. Both paths are decoupled for performance reasons, so there is a window of opportunity that may lead to two identical requests for the same data to the same fellow. Since the requests are identical, only one of them is answered. The remaining one never gets answered, which may potentially stall conflict resolution. Solution: The requests are now distinguished so that they are never identical. |
OSPL-13101 |
Premature alignment that can potentially lead to incomplete alignment. When node states have temporarily diverged and the nodes get reconnected again, their states must be merged according to the configured merge policy. Typically, this leads to requests for data from one node (say A) to the other (say B), and the other node sends the data. Before these requests for data are sent, node A first retrieves which partition/topic combinations (so-called groups) exist on B, so that it can ask for data for these groups. To do that, the requesting node A first sends a request for groups to B, and B replies by sending all its groups in newGroup messages, one by one. Each message contains an indicator of how many groups there are, so A knows how many groups to expect before requests for data can be sent out for each group. When node B creates a new group, this also leads to the publication of a newGroup message, which is received by all durability services. This message has an indicator that the number of expected groups is 0. If a new group is created while groups are being exchanged, it can happen that the number of expected groups is reset to 0, causing the durability service on A to prematurely conclude that all groups have been received, so node A starts requesting data from B without having acquired all groups. Because A sends requests for data for all groups that it currently knows, node A will only acquire data for a subset of the groups that B knows. Solution: When a new group is created, the number of expected groups is no longer reset to 0. |
OSPL-13106 |
Bug fix for segmentation fault when writing union data. idlpp used to generate incorrect code for serializing structs containing unions. Structs (Python objects) are serialized by copying them into a buffer according to a packing format, and the packing format for structs containing unions was generated incorrectly. Solution: A custom calculation has been added for the packing format of unions inside structs. |
OSPL-13133 / 00020000 |
The Tuner and Tester incorrectly remove spaces from string values. The Tuner and Tester use the CM api. Topic data that contains string or character values could be handled incorrectly by the Tuner or the Tester, because the CM api incorrectly removed leading and trailing spaces from string values. Solution: When processing string values, the CM api no longer removes leading and trailing spaces. |
OSPL-13205 |
Deficiency in deserialization of dds security messages fixed. Deserialization of string types could produce a one-character memory violation. This could cause unexpected behaviour when the character after the string buffer is not null. Solution: A fix was applied to keep memory accesses within bounds. |
OSPL-13228 / 00020129 |
FACE Read_Callback interfaces are now compliant with CTS requirements. The Read_Callback interfaces generated by the idlpp backend for FACE were not compliant with the requirements set out by the latest FACE CTS. Solution: The Read_Callback interfaces are now generated using the correct names. For example, for a type named Type1 the interface was named Type1Read_Callback, but according to the new CTS requirements it is now called Read_Callback_FACE_DM_Type1. |
OSPL-13239 |
Application crash on query condition read or take. A read/take on a query condition that implements a 'like' expression on a string-type key can result in a segmentation violation when an instance is unregistered or disposed. An internal memory optimization removes key fields from unregister and dispose messages, so the query should not address these fields; however, due to another string key optimization this did occur and resulted in a segmentation violation. Solution: An additional internal check was added to detect this use case and take a different approach, avoiding addressing these fields. |
OSPL-13247 |
OpenSplice installer causes Windows to reboot When Visual C++ Redistributable for Visual Studio 20XX is not installed and the OpenSplice installer triggers the installation a Windows reboot can occur causing the installation to fail. Solution: The Visual C++ Redistributable for Visual Studio 20XX will be installed without triggering a system reboot. |
OSPL-13252 |
When a node requests groups from another node, the answer is sent to all nodes instead of the requestor only. As part of aligning data sets between nodes, information about the available groups (i.e., partition/topic combinations) must be exchanged between the nodes. This is done by having one node send a request for groups to another node, after which the other node sends back the groups that it has. Instead of addressing the groups to the requesting node, the other node sent back the groups to everybody. This may cause unnecessary processing by nodes that have not requested the groups. Solution: When a node sends a request to another node, the answer is directed to the requesting node only. |
Report ID. | Description |
---|---|
OSPL-13212 |
The networking bridge service does not work correctly in combination with the networking service. The problem is that the publication and subscription match statuses are not correctly updated. The cause is that the alive status of a system is used when aligning the DCPSPublication topic. Because the alive status of a node at the other side of the bridge is not passed on, the corresponding builtin topics will not be inserted as alive when they are aligned by the durability services. As a result, the publication and subscription match statuses are not updated. Solution: Builtin topics received through alignment now make the corresponding instance alive, which causes the publication and subscription match statuses to be correctly updated. |
Report ID. | Description |
---|---|
OSPL-13188 / 00020092 |
In the isocpp2 api the SampleInfo equality operator behaves incorrectly. When comparing two SampleInfo values for equality, the value of the instance state attribute was not compared correctly. Solution: When comparing two SampleInfo values for equality, the instance state attributes are included in the comparison. |
OSPL-11802 / 00018828 |
Anonymous sequences represented in C or C99 may result in duplicate symbols during linking. When using the same anonymous sequence in multiple IDL files, linker errors might appear when linking the object files containing the C or C99 representation of these IDL files together, due to duplicate symbols. The collisions are caused by the allocator functions generated for the anonymous sequences, which bear the same name. Solution: The allocator functions generated for anonymous sequences are now always scoped by a unique context, i.e. the struct/union containing the anonymous sequence or the typedef that references it. This way name collisions are avoided. The original function names are still available as conditional macros pointing to these scoped allocator functions. If you pull in more than one implementation for an anonymous sequence, only one macro definition will be available, pointing to only one of the scoped functions (which should all be interchangeable). |
OSPL-12800 |
Implemented ^C handling for OpenSplice applications on Windows. Until now OpenSplice did not implement a Windows console termination handler for applications, which required applications to install their own termination handler and safely disconnect from the DDS Domain. However, the OpenSplice service could become corrupted when an application performed an exit in its termination handler, leaving the shared memory in an unstable state. Solution: OpenSplice now implements a console termination handler that safely disconnects from the Domain and then passes the event to any other termination handler. If an application had set its own termination handler, it will be called after being disconnected, so an exit will no longer leave the shared memory in an unstable state. Any OpenSplice calls performed by the application or application handler can, however, receive an 'already deleted' error code / exception because the domain was disconnected, and should handle it. |
OSPL-12887 |
The generic read/write functions provided by the SAC and C99 API do not support recursive types. The generic read and write functions provided by the SAC and C99 APIs use the type information to copy the topic data. When the topic type definition contains recursion, these generic read and write functions may crash, depending on the kind of recursion used. The problem was caused by the size of the recursive type being calculated incorrectly, causing misalignment of the members of a struct or a union. Solution: The size of a type that is used recursively is calculated first, to enable the correct calculation of the alignment. |
OSPL-12899 / 00019745 |
Missing files for the Simulink Vortex DDS blockset library validate-ports feature. A new feature for validating ports introduced a bug that threw exceptions when using the Simulink Vortex DDS blockset library. The exceptions were caused by missing files in the blockset library. Solution: The missing files have been added to the Simulink Vortex DDS blockset library. |
OSPL-13023 / 00019800 |
OpenSplice - OpenDDS authentication handshake problem was fixed for DDS Security interoperability. OpenDDS was not accepting some of the OpenSplice authentication handshake messages. As the handshake is the initial step of communication, it was impossible to interoperate with OpenDDS. Solution: A field in the handshake message was fixed, and the permissions file parsing behaviour was updated so that an absent partition is evaluated as the default partition. Current interoperability with OpenDDS is as follows: * Communication only works when the Discovery protection kind is ENCRYPT. Protection can still be enabled/disabled per topic (enable_discovery_protection) in the governance file. * The SIGN protection kind does not work in any case. * RTPS protection is not supported by OpenDDS. * Metadata protection and data protection work with the ENCRYPT protection kind. Only one protection can be used at a time. |
OSPL-13048 / 00019919 |
The error and warning messages that include topic names are removed from log files for discovery-protected topics. Local permission errors are written as messages to the error log and remote permission errors are written as messages to the info log. However, topic names could be seen in the logs, which can be insecure for discovery-protected topics. Solution: Permission errors are no longer logged for the following situations: * creation of a local topic, datareader or datawriter * match of a remote topic, datareader or datawriter |
OSPL-13074 / 00019944 |
Durability may crash when handling an asymmetrical disconnect. When an asymmetrical disconnect is detected by durability, it clears the namespace administration associated with the fellow that is considered disconnected. However, there may still be a merge action in progress that is associated with that fellow and tries to access the deleted namespace object. Solution: When using the namespace object, its reference count is incremented to ensure it can be accessed safely. |
OSPL-13080 / 00019948 |
DDS Security decoding of a payload may fail when the payload contains only a key. For a dispose message only the key of the corresponding instance is transmitted. In that case the payload provided to the crypto plugin for encoding could be not aligned on 4 bytes. The alignment was added later, when transmitting the encoded payload. This caused the receiver to be unable to decode the payload because of the added padding bytes. Solution: Padding bytes are now always added before calling the crypto plugin to encode the payload. |
OSPL-13094 / 00019955 |
Networking service may crash when using synchronous write. To support the synchronous write functionality, a special builtin reader is used. This builtin reader is a specialized reader with different attributes than normal application data readers. When the networking service reported a sample-lost event on this special builtin reader, a crash could occur. Solution: Sample-lost events are only reported on normal data readers. |
OSPL-13112 / 00019983 |
RT Networking may crash when a short disconnect occurs. When networking receives messages, it first puts them in a queue before processing them further. For a reliable channel, the messages read from this queue are put in the out-of-order administration related to the sending node. When networking considers a sending node not responding, it clears the administration related to that node. When reconnect is enabled and networking receives messages from that node again, it resumes reliable communication and considers the node alive again. Reliable communication is resumed from the first message received again from that node. However, when the disconnect is very short, there may still be old messages from that node present in the internal queue, which causes problems when they are put into the out-of-order administration related to that sending node. Solution: The out-of-order administration now rejects messages that are considered old. |
OSPL-13116 |
Durability may leak some memory when configured with delayed alignment When delayed alignment is configured alignment requests may leak when there is a previous request which is still waiting to be processed and the new request is ignored as a duplicate. Solution: Free the memory of an alignment request when it is considered a duplicate. |
OSPL-13118 / 00019991 |
DDS applications unresponsive to termination signals. Application signal handlers for SIGQUIT, SIGPIPE, SIGINT, SIGTERM and SIGHUP were no longer executed after implementing a fix for internal DDS service zombie processes caused by user-sent kill signals. The problem was caused by a signal mask that prevented passing the signal to the application handlers. In addition, the default OpenSplice termination handler was called even when an application handler was set, which was unintentional and could prevent applications from diverting termination requests. Furthermore, for Windows applications, neither application-defined handlers nor the default console control handlers were invoked after the OpenSplice control handler executed. Solution: The signal mask is lifted during the re-raise of the signal so that it is forwarded to application handlers. No OpenSplice termination handler is set when the application has set a termination handler. On Windows, OpenSplice no longer consumes the console event, so that the event is passed to other control handlers. |
OSPL-13124 / 00019998 |
Fixed some typos and clarified information in the deployment manual and configurator. Some typos were caught in the Deployment manual and some clarifications were suggested, especially with respect to the notation of ranges. Solution: The typos have been fixed and a more mathematical notation is used to indicate valid ranges for the parameters in both the Deployment manual and the Configurator. |
OSPL-13125 |
Transient_local instances which are disposed may not be removed when there are no writers. Instances of all durable topics (TRANSIENT, TRANSIENT_LOCAL and PERSISTENT) are maintained in a local storage. Instances which are disposed and have no writer are purged from this storage, taking into account the service_cleanup_delay. This storage can be in the complete or incomplete state. When the durability service is performing an alignment, the storage is set to the incomplete state, and while incomplete the storage is not purged, to ensure proper alignment takes place. However, when the configuration uses the durability service in combination with the ddsi service, the alignment of transient_local data is a task of the ddsi service, and the durability service will not take responsibility for the storage containing the transient_local instances. This caused the storage to remain in the incomplete state, suppressing purging of the storage. Note that when the RT networking service is configured, the durability service does take responsibility for aligning the transient_local topics. Solution: When the durability service is not responsible for the alignment of the transient_local topics, the purge suppression of the transient_local storage is disabled. Furthermore, transient_local instances which have no writers will also be purged from the local storage, to prevent late-joining readers from receiving transient_local data when there is no writer of that data. - Since there is one combined storage for all Writers, we cannot handle scenarios where Readers expect all samples from all Writers (e.g. Reader = KEEP_ALL or reader has history_depth > 1). - Neither can we handle scenarios where Readers selectively pick samples from a variety of sources (e.g. because of ContentFiltering on the Reader side, or because of differences in RxO QosPolicy settings.) 
- When Readers use BY_RECEPTION_TIME ordering, they might receive some samples again when a late joining Reader is created on the same federation. - The combination of a non-TRANSIENT_LOCAL topic with a TRANSIENT_LOCAL reader/writer will also not work as might be expected with respect to determining completeness of the data set. |
OSPL-13174 |
When a disconnect of a node is detected, the corresponding instances may not be unregistered. When the spliced daemon detects that a node has become disconnected, it should unregister all instances written by that node. For this purpose the information present in the corresponding DCPSPublication builtin topics is used. Note that in this situation both an alive and a disposed sample of the DCPSPublication are present. Instead of using the alive sample, the disposed DCPSPublication sample was used, which only contains the key. Solution: The alive DCPSPublication builtin topic samples associated with the disconnected node are now used to unregister the corresponding instances. |
Report ID. | Description |
---|---|
OSPL-12843 / 00019612 |
OpenSplice remains in a degraded state when a term signal is sent to an internal service such as durability. When a term signal is sent to an internal service, it terminates as if it was instructed by the spliced daemon during a normal shutdown. However, the rest of the system does not shut down: the terminated service remains as a zombie process and the rest of the system stays in a degraded state. Solution: A term signal handler is installed for internal services that notifies the spliced daemon to shut down OpenSplice when a term signal is received. |
OSPL-12959 |
Tester unable to write struct fields for a union type where the union does not have a case for a given enum. In the Tester, the Edit Sample dialog of the "Sample List" window showed an incorrect value for a union type that does not have a case for a given enum. Any attempt to write or dispose a sample containing such a union type would fail, throwing a CMException on write. Solution: The problem is fixed. The Tester can now write and dispose samples containing such union types. |
OSPL-12964 / 00019796 |
Durability does not reach the operational state
When a durability service starts, it checks whether all namespaces have a confirmed master before reaching the operational state. This check contained an error which could prevent a durability service from reaching the operational state. Solution: The condition used in the check has been fixed. |
OSPL-12965 / 00019797 |
The first message sent after detecting a remote node may not arrive at that node. The durability service depends on the reliable delivery of messages provided by the networking service. When the durability service detects a new fellow (node), it sends a capability message to the newly detected fellow. However, under very high load it may occur that this first message is not delivered by the networking service. The networking service provides reliable communication to a remote node on a particular networking partition once it has received the first acknowledgement from the remote node on that partition. When the first messages sent after detecting the remote node do not arrive at that node for a duration longer than recovery_factor times the resolution interval, this first message may never be delivered. Solution: The configuration option SyncMessageExchange has been added to enable sending a synchronization message to a newly detected node. By default this option is disabled, because older versions do not provide it. When enabled, a synchronization message is sent repeatedly until a corresponding acknowledgement is received or the configured timeout expires. When this option is enabled, the receiving node delays delivery of the first received message until a synchronization message is received or the configured timeout expires. |
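A configuration along these lines might enable the new handshake. The placement of SyncMessageExchange within the networking service configuration and the surrounding element names shown here are assumptions for illustration only; consult the Deployment manual for the authoritative schema:

```xml
<OpenSplice>
  <NetworkService name="networking">
    <Channels>
      <Channel name="reliable" reliable="true">
        <!-- Hypothetical placement: enables the synchronization message
             exchange towards newly detected nodes. Disabled by default
             for compatibility with older versions. -->
        <SyncMessageExchange enabled="true"/>
      </Channel>
    </Channels>
  </NetworkService>
</OpenSplice>
```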
OSPL-12966 / 00019798 |
Add an option to the networking service to include the send backlog in the throttling calculation. The calculation of the amount of throttling uses the number of receive buffers in use at the receiving nodes. A receiving node reports the number of used buffers to the sender in its acknowledgement messages. However, under high system load, messages may be dropped in the network or in the socket receive buffers of the receiving node. At the sending node, an increase in the number of unacked messages may indicate that network congestion is occurring. By including the number of unacked messages in the calculation of the throttling factor, a sender can react better to network congestion. Solution: The ThrottleUnackedThreshold configuration option has been added. When set to a value higher than zero, the number of unacked bytes exceeding the ThrottleUnackedThreshold is included in the calculation of the throttling factor. |
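The option might be set as follows. The placement under the channel's Sending element and the threshold value are assumptions for illustration; the Deployment manual is the authoritative reference:

```xml
<OpenSplice>
  <NetworkService name="networking">
    <Channels>
      <Channel name="reliable" reliable="true">
        <Sending>
          <!-- Hypothetical placement: unacked bytes beyond this
               threshold are included in the throttling calculation;
               a value of zero keeps the previous behaviour. -->
          <ThrottleUnackedThreshold>65536</ThrottleUnackedThreshold>
        </Sending>
      </Channel>
    </Channels>
  </NetworkService>
</OpenSplice>
```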
OSPL-12967 / 00019799 |
Networking is not able to resolve an asymmetrical disconnect correctly. The reliable channel of the networking service considers a remote node dead when it does not receive an acknowledgement in time, which is controlled by the Resolution, RecoveryFactor and MaxRetry configuration parameters. When a reconnect occurs, it is possible that the remote node did not notice the disconnect and is still waiting for a particular message to arrive. However, this message may no longer be present at the sending node. This may cause the reliable backlog at the receiving node to exceed its threshold or, when this occurs for more than one node at the same time, the total number of de-fragmentation buffers to exceed its threshold, resulting in termination of the networking service. Solution: The configuration option SyncMessageExchange has been added to enable sending a reset message when a remote node reconnects. By default this option is disabled, because older versions do not provide it. When enabled, the reset message is sent repeatedly until a corresponding acknowledgement is received or the configured timeout expires. The reset message contains the sequence number of the first message that is still available and the next sequence number to be sent, which allows the receiving node to reset its reliability administration. |
OSPL-13019 / 00019770 |
QoS change of the time-based filter policy not applied. Changing the time-based filter QoS policy in some circumstances did not result in the new value being applied to DataReaders with a reliable reliability QoS policy. Solution: The code responsible for QoS changes was fixed. |
OSPL-13030 / 00019840 |
Idlpp generates an incorrect copy routine for a type containing a typedef of sequences for isocpp2. When a type definition contains a typedef of a sequence type whose element type is another typedef of a sequence, the copy routine generated for the isocpp2 language binding by the idlpp pre-processor is incorrect and causes the application to crash. Solution: The generated copy routine has been corrected. |
OSPL-13053 |
The include files generated by idlpp do not have a unique guard macro. The include files generated by the idlpp pre-processor contain a guard macro derived from the basename of the source IDL file. This may cause problems when IDL files have the same basename but are located in different directories. Solution: When the "maintain-include-namespace" option is passed to the idlpp command, the guard macro generated by idlpp contains a unique prefix generated from the md5 hash of the contents of the IDL file. |
OSPL-13060 / 00019932 |
Unrecognized defines in isocpp2 for TRUE and FALSE. When using isocpp2 with an IDL file that contains a boolean, a compile error could occur stating that TRUE and FALSE are not defined. Solution: The defines are now properly initialized, and the native true and false keywords are now used for booleans generated by idlpp for isocpp2. |
OSPL-419 / 9168 |
For RT networking it is possible to silence tracing, but this was not possible for durability. In situations where log files become too large, tracing may have to be suppressed. In RT networking this can be done by setting the enabled attribute of the Tracing element to false. A similar approach was not possible for the durability service. Solution: The durability service now offers the possibility to suppress tracing by specifying the attribute //OpenSplice/DurabilityService/Tracing[@enabled]. This attribute is optional and defaults to true. Setting it to false silences tracing of the durability service. |
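For example, durability tracing might be silenced as follows. The enabled attribute and its path are taken from this fix; the OutputFile child element is shown only as an assumed illustration of a typical Tracing configuration:

```xml
<OpenSplice>
  <DurabilityService name="durability">
    <!-- enabled="false" silences durability tracing; the attribute
         is optional and defaults to true. -->
    <Tracing enabled="false">
      <OutputFile>durability.log</OutputFile>
    </Tracing>
  </DurabilityService>
</OpenSplice>
```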
Report ID. | Description |
---|---|
OSPL-12962 / 00019787 |
Python API memory leak on calls to serialize when serializing strings. When serializing a Python object containing string fields for writing, a global variable in the ddsutil package called _global_packed holds the Python references to the strings while they are arranged as input to struct.pack, and is cleared after the serialize routine is finished. When using the serialize code generated by idlpp, however, _global_packed was never cleared, leading to an ever-growing list of Python strings. Solution: Idlpp-generated Python code now clears the _global_packed list after every call to _serialize. |
OSPL-12955 / 00019790 |
QoS Provider URIs not sufficiently validated. When a QoS provider is created with a URI referring to a path instead of an XML file, a crash occurs. Solution: The URI validation of the common QoS provider code (shared by all relevant PSMs) was improved to return an error when the URI refers to a directory instead of a file (or symlink on supported platforms). |
OSPL-13006 |
GCC warning in idlpp-generated SACPP code (extra semicolon after namespace). When using gcc with strict options (e.g. -Wpedantic), a warning is triggered when compiling idlpp-generated classic standalone C++ code, due to an extra semicolon after the closing bracket of the DDS namespace. Solution: The warning was caused by a recent change, part of the new idlpp backend (OSPL-11549 / 00018613), and is fixed by not outputting the semicolon during code generation. |
OSPL-12842 / 00019595 |
Durability may not notice a temporary disconnect caused by the reliable backlog threshold being exceeded at networking level. When a receive channel is overloaded or does not get enough resources, acknowledgement messages may not be sent in time. This can result in an asymmetric disconnect at the sender side, which in turn may cause an expected packet never to be received at the receiver side, so that the reliable backlog threshold is exceeded. The resulting short disconnect and reconnect at the receiver side may not be noticed by the spliced daemon and the durability service, so durability may not receive an expected message. Solution: When the networking receive channel detects that the reliable backlog is exceeded, it declares the corresponding sending node dead for some time, to prevent messages still available in the receive socket from immediately reconnecting the sending node. Furthermore, the durability service now reacts not only to the disposed state of the heartbeat but also to the no-writers state, which indicates that a remote node has died and may become alive again shortly thereafter. |
OSPL-12846 / 00019611 |
Improved interoperability with Twin Oaks CoreDX. Strict QoS validation caused OpenSplice to ignore CoreDX endpoints that use all-zero values for the DurabilityServiceQos policy in RTPS-DDSI protocol messages, which is illegal according to the specification. Solution: It was decided in coordination with Twin Oaks to ignore the illegal DurabilityServiceQos in certain harmless cases (volatile or transient-local DurabilityQos) and to allow communication between OpenSplice and CoreDX endpoints. |
OSPL-12852 / 00019601 |
Improve max-samples threshold warning reports and configurability
The max-samples (and samples-per-instance) threshold warnings, when triggered through one of the PSMs, would imply a more serious error, or in other circumstances would not be reported at all. It was also not possible to disable the threshold warnings via the configuration file. Solution: The report mechanism was changed so that the warnings are consistently reported at the appropriate verbosity. The relevant configuration parameters (//Domain/ResourceLimits) now accept a value of '0' to disable the reports. |
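The warnings might be disabled with a configuration like the following. The child element names under ResourceLimits are assumptions for illustration; only the //Domain/ResourceLimits path and the special value '0' come from this fix:

```xml
<OpenSplice>
  <Domain>
    <ResourceLimits>
      <!-- Hypothetical element names: a warn threshold of 0 disables
           the corresponding threshold warning report. -->
      <MaxSamples>
        <WarnAt>0</WarnAt>
      </MaxSamples>
      <MaxSamplesPerInstance>
        <WarnAt>0</WarnAt>
      </MaxSamplesPerInstance>
    </ResourceLimits>
  </Domain>
</OpenSplice>
```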
OSPL-12864 / 00019622 |
Tuner is not able to import a data file in XML format. When the Tuner tries to import a data file in XML format and this data file contains whitespace in the text elements or Windows line endings, it fails to correctly convert strings to numeric values. Solution: The XML parser used by the Tuner now strips whitespace from the text elements. |
OSPL-12866 / 00019627 |
Face API hanging when a NullPointerException occurs in the send_event function. When using the Java Face API, a NullPointerException generated inside the send_event function caused the system to stop responding, because the connection was closed. Solution: The exception mechanism has been adjusted to cope with this exception; the connection is kept alive and the system continues to respond. |
OSPL-12868 |
DDS Security may incorrectly report that secure remote readers or writers do not have a participant permission handle. When DDS Security is enabled, it may occur that discovery information about remote readers or writers is received before the associated remote participant has been approved. The permissions of the remote readers and writers are checked again when the remote participant becomes approved. However, in this situation incorrect warning and error messages were logged. Solution: When discovery information about remote readers and writers arrives before the associated participant is approved, the warning and error messages are no longer logged. |
OSPL-12870 |
Incorrect XSD references in DDS Security configuration files. The XSD references in the bundled DDS Security configuration templates and example DDS Security configuration files were incorrect. Solution: The XSD reference in the Governance file was updated to https://www.omg.org/spec/DDS-SECURITY/20170901/omg_shared_ca_governance.xsd and the XSD reference in the Permissions file was updated to https://www.omg.org/spec/DDS-SECURITY/20170901/omg_shared_ca_permissions.xsd |
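A Governance file header referencing the corrected schema location might look as follows; the document structure shown is assumed from the usual DDS Security governance layout:

```xml
<!-- Governance document header with the corrected OMG-hosted schema. -->
<dds xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:noNamespaceSchemaLocation="https://www.omg.org/spec/DDS-SECURITY/20170901/omg_shared_ca_governance.xsd">
  <domain_access_rules>
    <!-- domain rules go here -->
  </domain_access_rules>
</dds>
```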
OSPL-12875 / 00019629 |
Incorrect handling of dispose-all topic operation The dispose-all operation is processed by remote nodes with help of the C&M Command builtin-topic. Processing of this topic by spliced, contained a flaw which could result in publication of a new C&M Command. This would lead to the same instances being disposed multiple times after using the dispose-all operation. Solution: The processing was fixed to publish a C&M command no more than once. |
OSPL-12876 |
Timestamps for DomainParticipant status logs in the Domain BuiltinTopic logfile are not in the standard Time format. Timestamps were logged as seconds.nanoseconds, e.g. '1552577680.107568118', but should be logged in the standard time format, e.g. '2019-03-14T16:34:40+0100 1552577680.107568118'. Solution: The format of the logging has been changed to the standard Time format. |
OSPL-12878 / 00019633 |
The use of the DDS Security plugins may cause a crash on some platforms. The DDS Security authentication and access control plugins both contain a function with the same name. On some platforms this may cause the linker to link the wrong function, which causes a crash during initialization of DDS Security when this function is called. Solution: The corresponding functions have been made static to limit their visibility. |
OSPL-12888 |
When using DDS Security with RTPS encryption enabled, memory is leaked. When DDS Security is used and the configuration enables RTPS message encryption, a memory leak occurs at the receiving (decode) side. Solution: The memory leak has been resolved. |
OSPL-12907 / 00019737 |
A DataReaderQuery creation can fail when the data type contains a long long attribute which is addressed in the query expression and compared to a positive integer constant, e.g. a query expression like "myLongLongField = 5". The problem was caused by an internal database query validation function that verifies whether the field type and constant type are compatible for comparison. This operation did not allow comparison between unsigned long long (the default type for positive expression constants) and signed long long fields. Solution: The operation has been corrected to also allow comparison between signed and unsigned long long fields and constants. |
OSPL-12911 |
Tester unable to read struct fields or write sample of recursive data type.
Tester was able to read samples of an IoTData-type topic (containing a recursive data type), but fields within a recursive IOT_NVP_SEQ were missing. Any attempt to write or dispose a sample containing a recursive IOT_NVP_SEQ would fail, throwing a CMException on write. Solution: The problem is fixed. The Tester can now correctly read, write and dispose samples of recursive data types. |
OSPL-12919 |
A deadlock could occur between a group coherent transaction becoming complete as a result of an alignment action performed by the durability service and an application deleting a coherent DataReader. The problem was caused by lock-order reversal between the group coherent administration and kernel group data forwarding. Solution: Kernel groups can insert transactions into the group coherency administration in parallel, which must happen under locking, whereas flushing completed group coherent updates is performed by only one thread and may be performed without locking the group coherency administration as soon as the completed transaction has been removed from the administration. Moving the flush outside the locked section solved the deadlock. |
OSPL-12935 |
Potential deadlock when deleting a participant. When a DomainParticipant is deleted by the user, it instructs the threads it spawned (the leaseManager, signalHandler, resendManager, sharedMemoryServiceMonitor and watchdog threads) to terminate, after which it tries to join those threads. However, the leaseManager thread had a glitch that could cause it to miss its termination event, after which the participant would wait indefinitely to join the leaseManager thread. Solution: The glitch in the leaseManager thread has been fixed, removing the chance that the participant gets into this deadlock. |
OSPL-13004 |
Detecting a remote topic that is not allowed locally by DDS Security permissions shuts down the local DDSI service. DDS Security permissions can be used to allow or deny the creation of topics. Consider a setup where node A is allowed to create 'mytopic', while node B is not. Node B will still receive the topic discovery data from node A. At some point, remote topics were mistaken for local topics and checked against the local permissions. Because node B thought it was handling a denied local topic, it shut down DDSI as per design. Solution: Detect whether a topic is remote or local before checking against local permissions. |
OSPL-13009 |
When duplicate data is received (e.g., due to alignment), this data is forwarded to readers and may lead to resurrection. When a node receives data, the data is injected into the internal administration. If the data has been received earlier (which can happen for instance after a reconnect), there is no need to forward it to the readers, because the readers have already received it. However, the data was still forwarded. This could lead to resurrection in case a reader had previously taken the data and the instance had become empty and no_writers. Solution: Duplicate data is no longer forwarded to readers. |
OSPL-12792 |
Unable to use Java language bindings on Java 11 JRE Due to the removal of the CORBA package from the JRE in Java 11 (more info: http://openjdk.java.net/jeps/320), certain exception classes are no longer available without using a 3rd party ORB. Solution: Since the use of an ORB is undesirable in classic standalone Java and Java5 PSMs the missing classes are included in the PSMs. A slightly modified version of the Glassfish ORB (https://github.com/eclipse-ee4j/orb) exception classes (specifically SystemException, BAD_OPERATION, BAD_PARAM, MARSHAL and NO_MEMORY) is compiled and packaged in the jars. Note: The use of RMI is not supported on Java 11. |
OSPL-12853 |
Listeners may receive an event that no longer applies to the listener mask just after the listener mask has changed, or, in debug builds, crash on an assertion in the OpenSplice library (v_listener code). The problem was caused by incorrect event list management in the OpenSplice core: old events were left behind and then passed to the application after the mask was changed, and in debug builds they triggered an assertion that expected an empty event list. Solution: The issue in the event list management is fixed; obsolete events are no longer left behind. |
OSPL-1144 / Case 00010809 |
Files generated by idlpp begin with a comment. The IDL pre-processor idlpp adds a header to each generated file. The contents of the template file 'fileHeaderContents', located in the 'Common' directory in OSPL_TMPL_PATH, are added to the start of each generated file. This template can be modified to customize the header contents, and the feature can be disabled by an idlpp command line option. Solution: |
OSPL-12287 / 00019125 |
Strange output from mmstat. Some numbers were printed as very large values instead of as negative numbers. Solution: The format of the printed numbers was changed from unsigned to signed. |
OSPL-12861 / 00019621 |
Allow hyphen in topic names. The hyphen character ('-') was not intended to be allowed in topic names, but due to an error the DDS V1.2 specification lists it as a valid character instead of the underscore ('_'). Solution: Since the specification is confusing and there is no real reason to disallow it, the hyphen was added as a valid character as long as it does not appear at the start or end of a topic name. |
Report ID. | Description |
---|---|
OSPL-3350 |
Query 'like' expressions on a topic with more than one key can crash with a segmentation violation. When the like expression addresses a key field on a topic with multiple keys, and that key field is not the last one defined in the topic key list, the internal query logic makes a mistake resolving the key value, leading to a segmentation violation. Solution: The bug in the query logic has been fixed and tested, and the query now works as expected. |
OSPL-10460 |
The idlpp pre-processor does not report invalid annotations. The idlpp pre-processor used to remove all comments before handling the type definitions, so annotations specified in comments (//@) were not parsed and therefore not checked for validity. Solution: The comments are no longer skipped; annotations present in these comments are processed and validated. Furthermore, the syntax of the annotations has been updated to the IDL 1.4 specification. |
OSPL-11154 / 00018509 |
The durability alignment may fail when the networking service detects another node but the communication is not yet reliable in both directions. When the networking service detects the presence of another node, it reports that node as alive. When the durability service detects this event and receives a first message from the corresponding fellow, it issues a namespace request. However, the reliable channel over which this durability message is sent may not yet be reliable at the other node. This may occur when there is much data loss at the receiving side of the other node (for example when socket buffers overflow); in that case the namespace request may be lost. The networking service resends messages from the moment it receives a first acknowledgement from the other node, and maintains a backlog to prevent the first messages from being lost when that first acknowledgement is received. However, when there is little traffic on the channel, the first messages sent after detecting the presence of the other node may still be lost. The durability service keeps expecting a reply to the namespace request as long as the fellow to which the request was issued remains alive for the durability service. Solution: At the moment the networking service detects the presence of another node, it reports the node alive and also starts resending messages to that node until it receives an acknowledgement from that node, or a timeout occurs and the node is reported as not alive. |
OSPL-11549 / 00018613 |
New idlpp backend for classic StandAlone C++. Until now, idlpp did not have its own backend for the classic C++ language binding: it forwarded this job to an external IDL compiler (cppgen) instead. However, cppgen suffered from several issues: * Not able to handle unions with a large number of branches * Not accepting all legal union discriminator types * Not able to handle recursion through the use of bounded sequences * Skipping generation of a typedef to another typedef in an included IDL file. Solution: A new classic C++ backend built directly into idlpp solves all of these issues. The new backend deprecates the use of cppgen, but if the new backend causes issues you can still choose to use cppgen instead by passing the command line parameter "-o deprecated-c++-mapping". |
OSPL-12362 |
Possible incorrect liveliness count on DataReader instances, and application crashes when a DataWriter unregisters an instance. Three bugs were detected and solved: - When a DataWriter unregisters an instance and is deleted immediately afterwards, the liveliness count on DataReader instances should be decreased once, but due to an internal race condition it could be decreased twice and eventually become negative. - When a DataWriter unregisters an instance and more than one DataWriter exists for this instance, the liveliness count was not always decreased, resulting in an instance that potentially never becomes not-alive. - When a DataWriter unregisters an instance, internally an invalid pointer was used that could cause the application to crash. Solution: The race condition, the liveliness counting issue and the invalid pointer use have been fixed. |
OSPL-12431 |
Entity enable only fails on the DataReader when its factory is not enabled. According to the specification, the enable operation on an entity should fail and return PRECONDITION_NOT_MET when its factory is disabled. Until now only the enable operation on the DataReader behaved according to the specification; all other entities would enter the enabled state and return OK. Solution: Checking of the factory's enable state has been added to Subscriber, Publisher and DataWriter, so the enable operation now behaves as specified for all entities except Topics. |
OSPL-12505 |
Potential memory leakage and incorrect liveliness counts. Some internal issues can potentially cause incorrect liveliness awareness and lead to leakage of DataReader instances and incorrect DataReader liveliness counts in instances. Solution: Detected internal issues are fixed. |
OSPL-12682 |
When data without valid content (e.g., a dispose) is forwarded to a reader and the internal instance cache pipeline is bypassed, readers may drop the data. Data that is published should end up at interested readers. Internally, a mechanism called the 'instance pipeline' is used to achieve fast access to the required reader instance when delivering data. On some occasions the instance pipeline is bypassed, in which case a lookup of the instance occurs. To look up an instance, the keys of the instance must be provided; if they are not, the reader cannot find the required instance. A code path existed where alignment data without valid content (e.g., a dispose) but with keys was received and transformed into a message without any keys. If this message was forwarded to the reader and the cache was bypassed, the data did not end up at interested readers. This could occur in situations where a dispose is aligned. Solution: If the pipeline is bypassed, the message is no longer transformed into a message without keys; the full-fledged message is used. |
OSPL-12755 / 00019529 |
Memory leak in durability kv-store in sqlitemt mode The sqlitemt mode of the durability kv-store uses XML instead of CDR serialization to store a more human-readable representation of persistent data, useful for testing/debugging purposes. A missing free in this code caused memory to leak each time a sample is stored. Solution: The issue was fixed by adding a free in the relevant code branch. |
OSPL-12759 |
The idlpp pre-processor should be able to maintain the full include directive in the generated files. When an IDL file contains an include of another IDL file, the files generated by the idlpp pre-processor contain a corresponding include statement. However, the include statement in the generated file only contains the basename of the included file. This may cause conflicts between source files which have the same basename. Solution: The optional option "-o maintain-include-namespace" has been added to the idlpp pre-processor. When this option is specified, idlpp maintains the include path as specified in the IDL file. |
OSPL-12786 / 00019536 |
For C# unions containing only primitive types, idlpp generates incorrect code. To marshal a union between C# and the database representation, the idlpp pre-processor generates an intermediate union representation to simulate the overlaying of the union cases, using a byte array. However, a byte array in C# may not overlay another type, which causes a problem when the database representation has the union case at an offset smaller than the pointer size, for example when the union contains only primitive types. Solution: To marshal C# unions, the idlpp pre-processor now generates an intermediate type which either reflects the definition of the union, when it contains only primitive types, or a buffer matching the union size to copy the C# union to or from. |
OSPL-12799 / 00019578 |
Possible crash in the durability service during termination. After the work on OSPL-12648 was finished, we discovered another path in the durability termination mechanism that could lead to a crash. Solution: The order in which threads are terminated has been rearranged to ensure this crash can no longer occur. |
OSPL-12838 / 00019602 |
Non-verbose error reports when the key-file cannot be created. When a shared-memory domain is started, a key-file with metadata is created in a temporary directory (i.e. OSPL_TEMP or /tmp). If this directory does not exist, or filesystem permissions do not allow creation of the file, an error report was created without including the path. Solution: The error report has been extended to include the path information. |
OSPL-12840 |
Processing many alignment requests is time-consuming. When there are many nodes and many groups, an aligner may at some point be faced with many alignment requests stored in a queue. For logging purposes it is possible to log the contents of the queue every time a request is added. It turned out that the code producing this logging information scales quadratically with the number of nodes and requests. Furthermore, this piece of code was exercised even when no log line was printed. This led to unnecessary time consumption and slow-performing alignment. Solution: The code is no longer exercised by default. |
OSPL-11262 |
Python Binding - Enabled reading of QoS from the Entity class, and reading of QoS policies from the Qos class. Users of this API may want to read the QoS policies that are set on an entity. Solution: The Entity class now has a 'qos' property, which is a Qos object. The Qos class now has getters and setters for each QoS policy. |
OSPL-12765 |
Face message connection shall be a case-insensitive named entity The face connection name was not case-insensitive; as a result, a connection named Foo would differ from a connection named foo and give back a different connection id. Solution: The defect is fixed; connections named Foo and foo will now communicate and also give back the same connection id. |
OSPL-12638 |
JavaScript API: Inconsistent QoS defaults for entities Depending on how entities are created, the QoS for the entities is set differently. For the reader and writer entities, if no argument is provided for QoS then the topic's QoS is used. This is inconsistent with the creation of these entities with their default QoS values. Solution: Entities created without specifying an explicit QoS are now created as if a QoS.{entity}Default() value were passed. This is a change from previous releases and may result in the writer being 'reliable'. As a result, calls to write(), dispose() and unregister() must be changed to writeReliable(), disposeReliable() and unregisterReliable(). You must use the 'Reliable' variants of these functions because reliable data writers may block on the DDS system, and could thus block the NodeJS JavaScript evaluation thread. The Reliable versions of these methods perform their processing on a separate thread, and indicate their success or failure via a returned promise. |
OSPL-12645 |
DDS Communication Status methods in Javascript API The Javascript API for DDS did not implement the DDS Communication Status methods described in the DDS specification. Solution: All DDS Communication Status methods have been implemented. |
OSPL-12597 |
Added ACE V6.5.0/TAO V2.5.0 to ORB abstraction layer for Classic Corba-Cohabitation C++ API. The ORB abstraction layer that comes with the Classic Corba Cohabitation C++ API hasn't been kept up to date for a while, and only contained ORBs that are outdated by today's standards, some of which may even refuse to build on newer Linux distributions. Solution: The latest version of ACE-TAO (ACE V6.5.0/TAO V2.5.0) has now been added to the ORB abstraction layer. |
Report ID. | Description |
---|---|
OSPL-12528 / 00019315 |
Dispose on synchronous writer blocking and returning timeout When an instance is disposed by a synchronous writer, the writer blocks until the dispose is acknowledged by the relevant readers. An issue caused readers to receive the dispose without acknowledging it, resulting in writers blocking for the maximum blocking time and then returning a timeout result code. Note that only dispose is affected, not write_dispose or any of the other write operations. Solution: The incorrect behavior was caused by the synchronous flag on the dispose-message getting removed by mistake and an invalid sequence-number comparison, both on the writer side; both defects have been fixed. |
OSPL-12637 / 00019334 |
The dbms connect service does not work with the latest MySQL version. The SQL syntax used by the dbms connect service to create the event table in the MySQL DBMS is no longer supported. This causes the creation of the event table and the corresponding trigger to fail. Solution: The SQL syntax to create the event table was updated to use only one primary key, which is auto-increment. |
OSPL-12648 / 00019383 |
Possible crash in durability service during termination A number of threads in the durability service access shared data. Depending on the context, during termination the cleanup of this data in one thread can cause another thread to crash. Solution: The order in which threads are terminated was changed to ensure the data is cleaned up after all relevant threads have finished. |
OSPL-12666 |
The idlpp pre-processor crashes on a recursive data type definition When the data definitions contain a typedef of a structured type that is used recursively, i.e. the structured type references itself, the idlpp pre-processor crashes when generating the corresponding type descriptors. Solution: When generating the type descriptor the idlpp pre-processor now maintains an administration of the type definitions currently being handled, which enables it to detect a recursion in the type definition. |
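The cycle-detection technique described in the solution can be sketched in Python: keep a set of type names currently being expanded, and emit a reference instead of recursing when a name is seen again. This illustrates the approach only; it is not idlpp's actual implementation, and the type-map shape is an assumption.

```python
def describe_type(type_name, type_defs, in_progress=None):
    """Build a nested type descriptor, detecting recursive definitions.

    type_defs maps a type name to the list of its member type names.
    A recursive reference is emitted as a marker instead of being
    expanded forever (unbounded expansion is what crashed idlpp).
    """
    if in_progress is None:
        in_progress = set()
    if type_name in in_progress:
        # Cycle detected: emit a reference rather than recursing.
        return {"recursive-ref": type_name}
    in_progress.add(type_name)
    members = [describe_type(m, type_defs, in_progress)
               for m in type_defs.get(type_name, [])]
    in_progress.discard(type_name)
    return {"type": type_name, "members": members}
```

For a self-referential type such as `struct Node { long value; Node next; }`, the second occurrence of `Node` is rendered as a reference marker and the recursion terminates.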
OSPL-12703 |
For C++ the idlpp pre-processor may report that it cannot find the include file "dds_dcps.idl" The location of the dds_dcps.idl file has been changed. This may cause the idlpp pre-processor to be unable to find this include file when compiling for the C++ language, because the default include path used by the idlpp pre-processor points to the wrong location. Solution: The default include path used by the idlpp pre-processor is updated. |
OSPL-12706 |
Using mmstat -t on Windows 64-bit may cause a crash. When using mmstat -t on Windows 64-bit, the tool crashes when accessing the kernel, which causes the spliced daemon to stop. The cause is the use of the ctime function, which depending on the context expects either a 32-bit or a 64-bit time value. Solution: The ctime function is replaced by one that always uses the time in 64-bit format. |
OSPL-12705 / 00019492 |
The cdr serialization of a topic with a recursive type definition may cause a crash. When the type definition of a topic contains recursion, a crash may occur when a corresponding sample is cdr serialized. The offsets of the fields of the recursive part are calculated incorrectly, causing memory to be overwritten. This problem may occur in the following cases: storing the sample in a persistent store, alignment of the sample by durability, or using recording and replay. Solution: The cdr serialization of a recursive type is corrected. |
Report ID. | Description |
---|---|
OSPL-12650 / 00019222 |
The secure networking service reports that it received a message with invalid protocol-id. In case the secure networking service has to send more than one ACK message, it may occur that the header of an ACK message becomes corrupted because it was overwritten by the previous encryption. This causes the receiving node to report that it has received a message with an invalid protocol id and to discard the message, which may cause an unnecessary retransmission. Solution: When more than one ACK message is sent, the buffer used to hold the ACK messages is reinitialized after sending the previous ACK message. |
OSPL-12693 |
The user-data field of the builtin DCPSParticipant topic may be missing remotely. When the ddsi service advertises a local participant it has to copy the QoS parameters of the corresponding participant QoS into the advertising message. This enables the receiving ddsi service to create a corresponding builtin DCPSParticipant topic. However, the user-data field was not copied at the sending side, and thus the created DCPSParticipant topic had an empty user-data field. Solution: The user-data field of the participant QoS is copied to the message which advertises the participant on the network. |
Report ID. | Description |
---|---|
OSPL-12660 / 00019363 |
Files created outside the configured OSPL_TEMP location The shared-memory monitor creates 'osplsock' files which do not adhere to the configured OSPL_TEMP environment variable; instead the file is always created in /tmp. Solution: The code responsible for socket file creation is changed to prepend the value of OSPL_TEMP instead of '/tmp'. Note the fallback is still to use '/tmp' in case OSPL_TEMP is unset. On Posix systems a restriction of 108 bytes applies to the length of OSPL_TEMP due to the nature of Unix domain sockets; exceeding this results in an error message and OpenSplice will not start. |
OSPL-12253 / 00019074 |
Networking service trace logs of different threads are interleaved The trace reports of the (secure) RTNetworking service on a busy system can be interleaved, i.e. two threads writing partial traces to the output file that end up on the same line. This decreases readability of the log and makes automated processing more difficult. Solution: The issue was resolved by no longer writing partial traces to the output file. There is a possibility of order reversal of reports by different threads, though that should only be cosmetic; reports of the same thread will still be in order. |
OSPL-12496 / 00019225 |
Durability service fails to update status on termination The durability service needs to terminate when it detects an invalid namespace configuration. This normally occurs shortly after startup and follows a different termination path that fails to update the service-state. This in turn prevents spliced from taking appropriate actions. Solution: The durability code was fixed so the service-state is correctly updated before the service terminates. |
OSPL-12561 |
When two type definitions use an anonymous sequence of a type with the same name but in a different scope a crash may occur. When an anonymous sequence or array is created, a corresponding type definition is registered within the database, which is used to allocate and free the anonymous sequence or array. This anonymous sequence or array type was registered using the name of the sub-type. However, the name of the sub-type may not be unique because it may occur in different scopes. This may result in the wrong type being used to allocate the anonymous sequence or array. Solution: The full-name of the sub-type, which is guaranteed unique, is now used when registering the anonymous sequence or array type. |
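The effect of keying on the fully scoped name can be illustrated with a small Python sketch. This is a hypothetical registry for illustration only, not the actual database code; the names and the element-size field are assumptions.

```python
class TypeRegistry:
    """Registers anonymous sequence types keyed by the sub-type's
    fully scoped name (e.g. "ModuleA::Foo"), not the short name
    ("Foo"), so equally named sub-types in different scopes get
    distinct sequence type definitions."""

    def __init__(self):
        self._seq_types = {}

    def register_sequence(self, subtype_fullname, element_size):
        key = "sequence<%s>" % subtype_fullname
        if key not in self._seq_types:
            # First registration for this scoped sub-type.
            self._seq_types[key] = {"name": key, "element_size": element_size}
        # Re-registration returns the existing definition unchanged.
        return self._seq_types[key]
```

Had the registry keyed on the short name, `ModuleA::Foo` and `ModuleB::Foo` would have collided and one sequence would have been allocated with the other's element size, which is the class of defect the note describes.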
OSPL-12564 / 00019319 |
The use of the C API generic read/write operations may cause a crash on ARM. For some type definitions the use of the generic read/write operations provided by the C API may cause a crash on the ARM platform. For sequence types the data is stored differently in the database than at the API layer: at the API level a sequence has the additional fields _length, _maximum and _release. The generic copy routines that are generated from the XML type definition correct for the different sizes of the sequence stored in the database and how it will be copied to/from the C API. However, on the ARM platform a misalignment may occur when the sequence type is followed by a data type that has to be aligned on an 8-byte boundary. Solution: The data structures used by the generic copy routines are updated to include information to correct the alignment differences between the database representation and how the data is represented at the C API. |
OSPL-12581 / 00019361 |
Java OSGi support broken Since V6.9.2, when the build of the Java(5) APIs was changed to Maven, the dcpssaj-osgi-bundle.jar file was not OSGi compliant. Solution: The following jar files are now OSGi compliant: dcpscj.jar, dcpssaj.jar and dcpssaj5.jar. These OSGi jar files can still be used as 'normal' jar files; no extra separate OSGi jar files are needed. |
OSPL-12621 |
Type error in isocpp/isocpp2 xtypes interface When trying to use the xtypes interface of isocpp or isocpp2, a compilation error occurs on 'enum type'; this should be 'enum Type' in the TypeKind_def. Solution: The type error is fixed. |
OSPL-12672 / 00019446 |
SuspendedPublication link error on Windows Creating a SuspendedPublication object using the ISOCPP2 API on Windows results in a link error. Solution: The link error on Windows is resolved. |
OSPL-12548 |
The DDSI2 service may establish secure TCP connections using TLSv1.1 instead of the more secure TLSv1.2 When establishing a secure TCP connection, the DDSI2 networking service used the less secure version 1.1 of the TLS protocol instead of the more secure 1.2. Solution: The initialization of the secure TCP connection is changed to allow only TLSv1.2. |
OSPL-12572 |
The serializer used to convert data to CDR format does not handle large arrays correctly. When the type definition of a data type contains a large array of more than 2^23-1 primitive elements, the CDR serializer is not able to convert the data: internally the serializer uses an upper limit of 2^23 when serializing an array of primitive types. Solution: To convert data that contains an array with more than 2^23-1 primitive elements, the CDR serializer now iterates over successive parts of the large array, serializing each part in turn. |
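The chunking strategy can be sketched as follows. `serialize_chunk` stands in for the serializer's existing primitive-array routine, and all names here are illustrative assumptions, not the actual CDR serializer code:

```python
def serialize_large_array(elements, serialize_chunk, max_chunk=2**23 - 1):
    """Serialize an arbitrarily large array of primitives by feeding
    the per-call serializer successive slices no larger than its
    internal element limit, then concatenating the results."""
    parts = []
    for start in range(0, len(elements), max_chunk):
        parts.append(serialize_chunk(elements[start:start + max_chunk]))
    return b"".join(parts)
```

Because each slice stays under the limit, the existing primitive-array path can be reused unchanged for arrays of any size.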
Report ID. | Description |
---|---|
OSPL-11809 / 00018836 |
Possible crash after detaching domains in a multi-domain application A second case was discovered in the user-layer code that protects access to the kernel (see previous OSPL-11809 release note). A small window exists that allows a thread to access kernel memory while the domain is already detached. Solution: The previous fix involved protecting access for threads leaving a protected (kernel) area. Entering the kernel also contained an issue, fixed by storing the relevant data in the user-layer so it can be accessed safely. |
OSPL-12135 / 00019020 |
Python DCPS API - Incorrect struct align and padding logic for dynamic (de)serializer. There were certain cases observed with certain topic types where data corruption within the sample payload occurred on reading data with the Python API coming from remote writers. The cause of the corruption was that the deserializer populated the Python class for the topic data type from the incoming byte buffer with incorrect struct padding information. Solution: The calculation for struct padding has been corrected. |
OSPL-12271 / 00019111 |
Database mapping-address overlap on 64-bit ARM The 64-bit ARM (aarch64) builds used a default mapping address normally only used on 32-bit systems (0x20000000). Solution: To decrease the chance of address overlap, the default was changed to match the address used on 64-bit Intel-based systems (0x140000000). |
OSPL-12276 / 00019117 |
Error in the generated python object For statically generated Python code (via idlpp -l python), if an IDL struct is defined with only a single field then the generated member of '_member_attributes' becomes a string instead of a tuple. This happens because ('name') is interpreted as a string, whereas ('name1', 'name2') is a tuple of two strings. Solution: The '_member_attributes' in the generated python code now has an additional comma at the end. Therefore, now the '_member_attributes' will be interpreted as a tuple even if the IDL struct contains a single field. |
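The Python behaviour the fix relies on is easy to reproduce: parentheses alone do not create a tuple, the trailing comma does.

```python
# A parenthesized single string is just a string...
single = ('name')
assert isinstance(single, str)

# ...while a trailing comma makes it a one-element tuple,
# consistent with the multi-field case ('name1', 'name2').
single_tuple = ('name',)
assert isinstance(single_tuple, tuple)
assert len(single_tuple) == 1
```

Emitting the trailing comma in the generated `_member_attributes` therefore makes single-field and multi-field structs behave identically.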
OSPL-12280 / 00019122 |
OSPL Tuner shortcut for Create Reader-Writer | Existing Partition should not be CTRL-A The shortcut for Create Reader-Writer | Existing Partition does not work when it is set to CTRL-A, as this shortcut already has the default behaviour of 'select all rows' associated with it. Solution: The shortcut has been changed to CTRL-B. |
OSPL-12330 |
Python API segfault crash when receiving a disposed instance When subscribing with an application using the Python API, if a topic contains a string field, and a reader on that topic receives a state update (as in valid_data == False), then while deserializing the sample a segfault crash occurs due to attempting to dereference the null pointer for the string field. Solution: Fixed the null pointer dereference when deserializing string fields. |
OSPL-12441 / 00019190 |
When a candidate master becomes non-responsive during master selection, durability may not reach the COMPLETENESS state When a durability service starts, one of the first things it does is look for other durability services (fellows). If there are any fellows, the durability services will negotiate which one of them will act as the master aligner for a namespace. In case a fellow becomes non-responsive while negotiating mastership, the other durability services still try to elect this fellow as master. Because the fellow is non-responsive this will fail, causing the durability service to never reach the COMPLETENESS state. Solution: When a fellow becomes non-responsive while negotiating mastership, the fellow is excluded as potential master. |
OSPL-12442 |
Python exception when reading sample containing sequence of Enum field. Using the Python API, when reading samples from a topic registered via find_topic or GeneratedClassInfo.register_topic, and the topic type contains a sequence of enum type, reading a sample fails due to an exception raised while deserializing the field. Solution: A conditional to handle the sequence-of-enum case was missing; it has been added. |
OSPL-12454 |
Similar conflicts are not always combined, which may potentially lead to slow or failing alignment. The durability service is responsible for keeping states consistent. Whenever an event happens that requires the durability service to take action (e.g., a disconnect/reconnect) a so-called conflict is generated that needs to be resolved. Under certain circumstances multiple but similar conflicts can be generated. Because these conflicts are similar it is sufficient to resolve only one of them. However, due to a bug it was possible that multiple similar conflicts were generated and resolved sequentially. In particular, it was possible that conflicts were generated at a rate that effectively caused the conflict queue never to become empty. Because the durability service only advertises group completeness when the conflict queue is empty, this could effectively lead to stalling alignment. Solution: The algorithm that decides when conflicts are similar is changed, so that similar conflicts are now discarded. |
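The discard-on-insert behaviour can be sketched with a hypothetical queue. The similarity criterion used here (same conflict kind and namespace) is an assumption for illustration, not durability's actual rule:

```python
class ConflictQueue:
    """Queue that discards a newly generated conflict when a similar
    one is already pending, so a burst of similar events cannot keep
    the queue permanently non-empty."""

    def __init__(self):
        self.pending = []

    def add(self, conflict):
        if any(self._similar(conflict, c) for c in self.pending):
            return False  # discarded: a similar conflict is already queued
        self.pending.append(conflict)
        return True

    @staticmethod
    def _similar(a, b):
        # Assumed criterion: same kind of conflict on the same namespace.
        return a["kind"] == b["kind"] and a["namespace"] == b["namespace"]
```

With this behaviour, resolving the one queued conflict covers the whole burst, so the queue can drain and group completeness can be advertised.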
OSPL-12525 |
Idlpp python codegen fails to compile IDL containing const value of typedef type. When using idlpp for python code generation, if the IDL contains a const value whose type is a typedef, like the following: typedef int myint_t; const myint_t x = 0; then idlpp exits immediately with error code 1 and the printout "should not get here. idl_type == 1". Solution: The case for handling const definitions whose type is a typedef is now handled. |
OSPL-12526 |
[JavaScript] Potential errors reading from topics with sequences On reading a sequence, the DDS JavaScript API receives two integer values from the underlying DDS system: the length of the sequence read, and the maximum length of the buffer containing the sequence. The JavaScript API erroneously assumed that these two values were always the same, and erroneously made use of the returned maximum value. This could result in the JavaScript API including more sequence elements than a sample actually contains. Such behaviour can be seen if your code performs two or more read/take operations, with the first operation returning a sample with a larger number of sequence elements and the second operation returning a sample with a smaller number of sequence elements. The sample from the second operation would then erroneously include extra sequence elements from the first sample. Solution: The DDS JavaScript API has been corrected to distinguish between sequence length and the maximum sequence buffer size, and to correctly use the actual sequence length when reading. |
OSPL-12541 / 00019322 |
Missing dispose messages for late joining readers Consider the scenario where a late-joining DataReader receives historical data from a deleted DataWriter with the auto-dispose QoS policy set. The DataReader expects to receive the historical data in the state NOT_ALIVE_DISPOSED. However, tests showed that since V6.10 the state has changed to NOT_ALIVE_NO_WRITERS. Solution: The problem was introduced by combining the Unregister and Dispose messages as an optimization, where the Unregister took precedence over the Dispose. A fix is applied so that the Dispose is processed correctly. |
OSPL-11674 |
Simulink: Optional output directory argument to Vortex.idlImportSl Previous releases of the Simulink integration for OSPL did not allow specifying an output directory when the Vortex.idlImportSl() processed an IDL file. Solution: An optional argument has been added to Vortex.idlImportSl. If the outputDirectory parameter is provided, output from the IDL processor will be placed in the specified directory. Otherwise, current behaviour will be maintained, and the IDL process output will be placed in the current working directory. |
OSPL-12074 |
NodeJS DCPS API - New convenience function added to convert seconds to nanoseconds The DCPS API has functions that expect a number of nanoseconds as a parameter, which can be inconvenient to specify directly. Solution: A new convenience function was added so that users can easily convert seconds to nanoseconds. |
OSPL-12152 |
Python API missing factory method to create a topic The Python API for DDS did not include a DomainParticipant method to create/register a Topic. Instead, users had to directly invoke the Topic constructor. This was inconsistent with the way the API deals with the creation of child entities. Solution: A create_topic method has been added to the DomainParticipant class. Customers should prefer this method to calling the Topic class constructor directly. |
Report ID. | Description |
---|---|
OSPL-12135 / 00019020 |
Python DCPS API - Incorrect struct align and padding logic for dynamic (de)serializer. There were certain cases observed with certain topic types where data corruption within the sample payload occurred on reading data with the Python API coming from remote writers. The cause of the corruption was that the deserializer populated the Python class for the topic data type from the incoming byte buffer with incorrect struct padding information. Solution: The calculation for struct padding has been corrected. |
OSPL-12196 / 00019046 |
Memory leak when client durability client sends unanswered request When the client durability client sent out a request and it did not receive a response the request was kept indefinitely which caused memory leakage. Solution: Added a garbage collector to client durability client which removes unanswered requests after ~35 seconds |
OSPL-12328 / 00019137 |
Unions in C# may be misaligned on some platforms Depending on the exact contents of an IDL union, its C# representation might be misaligned on certain platforms. For example, on some 32-bit platforms a union with a 64-bit branch would not align that branch on 64 bits (as required on certain 32-bit platforms), but rather on 32 bits. Solution: The algorithm to determine alignment has been corrected and is now based on the worst-case alignment requirement of all the union's branches. |
OSPL-12342 |
Crash of the spliced service when using OSPL_LOGPATH. Possible crash of the spliced service (in shared memory configuration) or application (in single process configuration) when the user sets the environment variable OSPL_LOGPATH. Solution: The crash was caused by spliced trying to free the memory of the variable returned by the getenv operation; freeing the returned variable is neither required nor allowed. The problem is solved by removing the free from the code. |
OSPL-12348 / 00019160 |
SuspendedPublication compile error Creating a SuspendedPublication object using the ISOCPP2 API results in a compile error. Solution: The error is fixed and the SuspendedPublication object can now be used correctly. |
OSPL-12349 / 00019164 |
OSPL_LOGPATH environment variable not used Due to a comparison bug, the value of the OSPL_LOGPATH environment variable was rejected even when it contained a valid string referring to a directory with write permission. Solution: The comparison bug was fixed so that only values referring to an invalid path and/or a directory with insufficient permissions are rejected. |
Report ID. | Description |
---|---|
TSTTOOL-504 / 00019037 |
Tester freeze when creating reader on transient-local topic When Tester creates a data reader, the qos used for durability/reliability was always locked to volatile/best-effort, while the writer got all the inherited topic qos or user qos options. Also, waitForHistoricalData was called by default when the reader was created, unless explicitly disabled in the reader creation options. In one case the combination of volatile/best-effort qos and waitForHistoricalData would cause the Tester to freeze for a long period of time when the topic to be subscribed to has transient-local durability, because waitForHistoricalData blocks until timeout. Solution: Tester is now disallowed from calling waitForHistoricalData when dealing with volatile topics, and created readers are no longer locked into volatile/best-effort if the user explicitly chooses qos. In summary, readers created with default settings are locked into volatile, best-effort, and a newly enforced order-by destination timestamp; readers created with explicit settings take on whatever qos is selected and will wait for historical data if that option is selected. Writer creation is unaffected. |
OSPL-12252 / 00019073 |
Out of memory due to memory leakage. Various memory leaks in shared memory caused out of memory errors when calling API functions. Solution: Memory leaks are fixed. |
OSPL-11760 / 00018777 |
OpenSplice signalhandler deadlocks when crash in malloc. The OpenSplice signalhandler deadlocked when a SEGV occurred in the malloc system call, as the signalhandler did allocations for reporting. Solution: Removed allocations from reporting when called by the signalhandler, and added a 60-second timeout on handling of synchronous signals after which the process is terminated. |
OSPL-11976 / 00018930 |
Durability service is reported DIED when resending for more than the heartbeat expiry time. When the durability service starts resending samples and isn't able to successfully write a sample before the heartbeat expiry time expires, the service is reported DIED. When durability started resending samples, some of its threads were holding a lock while continuously trying to write the sample; this caused the renewal thread to block and not renew the lease. Solution: The locks are released before resending. |
OSPL-12194/OSPL-12195 |
Python/Javascript Binding IDL processing errors in OpenSplice Evaluation version. When processing IDL files with the Python/Javascript binding in an evaluation version of OpenSplice, an error occurs due to the EVALUATION VERSION idlpp printout. Solution: The evaluation version of the Python/Javascript binding now correctly processes IDL files. |
OSPL-12170 / 00019038 |
Memory Leak concerning v_groupInstance. When writing and disposing instances it can happen that the v_groupInstance count grows where it should remain stable. Solution: The leaking v_groupInstance is fixed and the count will remain stable again. |
OSPL-12712 |
[Simulink] Simulink Coder support for cross-compiling to another target When using Simulink Coder to cross-compile a model to a OpenSplice target, the Simulink model needs access to the host environment OpenSplice executables, includes and libraries as well as the target environment OpenSplice libraries. Solution: To support cross-compilation, the VortexDDS Block Set now supports the LINK_OSPL_HOME, which should be set to the home directory of the target OpenSplice installation. During Simulink Coder compilation, a warning will be printed on the console indicating that LINK_OSPL_HOME is being used. For Simulink Coder builds targeted at the host OpenSplice installation, LINK_OSPL_HOME should not be set. |
OSPL-12219 |
[Simulink] Calculation of model-relative paths for QoS profiles occasionally incorrect To enable model portability, the DDS block set for Simulink attempts to store references to QoS Profile XML documents relative to the Simulink model. However, in cases where the Simulink model included several subsystems across subdirectories, or where the QoS file was not in a subdirectory of the Simulink model, the calculation was incorrect. Solution: QoS Profile paths are now calculated relative to the 'root' Simulink model. In the case where the QoS Profile document is not contained directly or indirectly in the folder tree containing the root model, '../' elements are added to the QoS Profile path in order to ensure a correct relative path. Note that, as with previous releases, when executing models (or programs generated from models via Simulink Coder), QoS Profile XML files are located relative to the current working directory when the execution starts. |
OSPL-12218 |
[Simulink] Global variable 'domain' being created by Participant block Using the DDS Participant block in a Simulink model caused a global variable 'domain' to be created. Solution: The variable is no longer created. |
OSPL-12118 |
Potential crash when reading data from persistent XML store When a durability service starts it may inject persistent data from its store (when available). In case the persistent data was stored in an XML store, the pathname of the file that contains the store is dynamically allocated. The implementation assumed that folder separation symbols occupy a single character, which is not true on some operating systems; in those cases too little memory was allocated, which could lead to a crash. Solution: The pathname now uses an operating-system-dependent file separator. |
OSPL-12090 / 00018967 |
Listener is not triggering on a built-in topic reader. When using the Java(5) API and when a listener is attached to a built-in topic reader the listener is never triggered. Solution: The defect in the built-in topic mechanism is solved and a listener attached to a built-in reader will now trigger as expected. |
OSPL-12085 |
Missing methods in C# Tutorial example. The C# Tutorial example contains a messageboard executable which uses an ExtDomainParticipant class. This class was missing two methods. Solution: The missing methods are added to the ExtDomainParticipant class. |
OSPL-12078 / 00018965 |
When the namespace state has been reset and a native state conflict occurs, the durability service can crash. Whenever a master for a namespace advertises a new state, slave nodes generate a native state conflict. To resolve such a conflict, slave nodes will start an alignment from their master. In case the namespace state of the slave has been reset just before the native state conflict is resolved, it is possible that the durability service crashes due to an invalid dereference. Solution: The invalid dereference has been fixed. |
OSPL-12042 |
Build error when building from source on macOS When building from source on macOS, the build of ospl_uniqueID failed due to a not yet existing directory. Solution: The directory is created as part of the build process. |
OSPL-11974 |
Idlpp ignores illegal enum idl construction. When multiple enums in the same idl module use the same labels, idlpp did not produce an error, even though this is not valid idl. Solution: The problem is fixed and idlpp now reports the invalid idl construction. |
OSPL-11971 / 00018921 |
Reading an invalid sample may cause infinite triggering loop. Invalid samples are meant to communicate an instance state change in the absence of user samples. Their only purpose is to communicate the instance state change; once that has been done they serve no further purpose and are therefore no longer visible on subsequent read/take actions. However, if an invalid sample is read (not taken), it stays behind in the reader where it may cause a Read/QueryCondition on a SampleState of ANY to continue to trigger. Since any subsequent read/take operation will ignore the invalid sample, there is no way to get rid of it, causing the Read/QueryCondition to spin indefinitely. Solution: The read operation now destructively takes invalid samples out of the reader cache, thus preventing the Read/QueryCondition from spinning on them. |
OSPL-11957 |
Shared memory consumption may increase after many restarts of nodes in case coherent transactions are involved, leading to potential shared memory exhaustion. The persistent store maintains End-Of-Transactions (EOT) messages that are published when a transaction is completed. An EOT should be removed from the persistent store if no samples belonging to the EOT are referenced anymore, but this was not happening in all cases. As a result the persistent store maintains too many EOTs. When these messages are injected after a restart, they will never be removed anymore. Over time, this may cause shared memory depletion. Note: this issue relates to OSPL-11941 Solution: EOT messages are now correctly removed from the persistent store. |
OSPL-11928 / 00018882 |
Tracing Verbosity uppercase values accepted but not recognized by RTNetworking. When using RTNetworking and setting a Tracing Verbosity value in uppercase, the value was accepted by the configuration checker but not recognized by the RTNetworking configuration parser. Solution: The defect is fixed; both uppercase and lowercase values are now recognized by the RTNetworking configuration parser. |
OSPL-11786 / 00018791 |
Generating dead code with IDLPP from an IDL containing a string definition. When using an IDL file containing a string definition and generating code for the C language, dead code was generated. Solution: The defect in the generation algorithm is fixed and correct code is now generated. |
OSPL-11532 |
Python API: get_key method implemented for DataReader and DataWriter Previous releases of the Python API for DCPS did not include implementations for the get_key method on DataReader and DataWriter classes because an underlying DCPS error would have caused these methods to fail. (The get_key method accepts an 'instance handle' and returns an object instance with the key fields initialized to the appropriate values for the instance handle. All other field values have default values.) Solution: The get_key method is now available. |
OSPL-11506 |
To save network bandwidth it is now possible to start alignment only when the topology has become stable. As soon as a durability service joins an existing system, alignment may occur between the durability service and its master. When the topology changes while the alignment is going on (e.g., new subsystems joining), the alignment itself may no longer be necessary (e.g., because somebody else has become master). Unfortunately, there is no way to cancel ongoing alignments. To prevent such potentially unnecessary alignments it is more efficient to start alignment when the topology is "stable", meaning that no topology changes have been detected for some time. This increases the chance that all nodes have selected the same master, which may lead to fewer alignments. The optional configuration option //Opensplice/DurabilityService/Network/Alignment/Topology/Stable can be used to specify the stability period (defaults to 0.0 seconds). To prevent that no alignment will ever occur in deployments where the topology is never stable, a maximum can be provided using the //Opensplice/DurabilityService/Network/Alignment/Topology/Stable[@max] attribute (which defaults to -1.0, meaning infinite). When the maximum time has expired and the topology is still not stable, alignment will occur anyway. Solution: It is now possible to postpone alignment until the topology has become stable. |
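As an illustration only (not part of the original note), a configuration sketch using the Stable element described above. The placement of the surrounding elements follows the documented path //Opensplice/DurabilityService/Network/Alignment/Topology/Stable; the chosen values (5.0 and 30.0 seconds) are arbitrary:

```xml
<OpenSplice>
  <DurabilityService name="durability">
    <Network>
      <Alignment>
        <Topology>
          <!-- wait until no topology changes have been seen for 5.0 s,
               but start alignment anyway after at most 30.0 s -->
          <Stable max="30.0">5.0</Stable>
        </Topology>
      </Alignment>
    </Network>
  </DurabilityService>
</OpenSplice>
```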
Report ID. | Description |
---|---|
OSPL-12211 |
Durability service with legacy master selection not directly slaving to existing master. When a durability service with the highest fellow id and an unconfirmed master starts legacy master selection, it fails to slave to the existing master until it starts majority voting. Solution: The existing master is now selected sooner. |
OSPL-12133 / 19018 |
After a reconnect, durability is not aligned when a fellow returns within 2*heartbeatPeriod and the legacy master selection algorithm is used. When a fellow connects to a durability service, a connect conflict is created. When this happens within twice the heartbeatPeriod after a fellow got disconnected, the durability service has potentially lost its master. When it lost its master, the connect conflict was discarded because no master was selected, which could lead to inconsistent data states. Solution: The connect conflict is no longer discarded when no master is selected. The conflict remains until a merge action is applied. |
Report ID. | Description |
---|---|
OSPL-11873 |
Possible application crash when using OpenSplice as a Windows service. In V6.9.2p2 a fix was made for the issue that an access violation could occur when OpenSplice was used as a Windows service and an application was started. Unfortunately, part of the fix did not make it into V6.9.2p2. This is now corrected. Solution: The complete fix is now included and the crash should no longer occur. |
Report ID. | Description |
---|---|
OSPL-12016 |
Logrotate not working correctly with OpenSplice log files. When using logrotate with the RTNetworking service, rotating works correctly but the truncate function does not work properly. Solution: The problem is fixed and the RTNetworking service log file can now be properly truncated. |
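For illustration, a logrotate stanza along these lines is typically used for logs written by a long-running service. The log path is hypothetical and the directives are standard logrotate options, not OpenSplice-specific:

```
/var/log/opensplice/networking.log {
    daily
    rotate 7
    # copy the file and truncate it in place, so the service can keep
    # writing to the same open file descriptor after rotation
    copytruncate
}
```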
OSPL-12007 |
When the IDL contains a union with an unsigned short as discriminant, serialization may fail on big-endian hosts. If the IDL contains a union with an unsigned short discriminant (and only an unsigned short), the constants for the case labels are represented in the OpenSplice metadata as a different type than the type of the discriminant itself. In the process of converting the type into instructions for the serializer VM, the case labels are read from memory as if their representation were that of the discriminant type (even though the values carry a type indication themselves, and this type indication is correct for the actual value). On a big-endian machine it therefore reads all case labels as 0. Consequently, any case that differs from 0 will be handled as case 0. This leads to invalid serialization, which users may experience as a communication problem. Solution: The constants for the case labels are now correctly represented in the OpenSplice metadata. |
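A minimal IDL fragment that exhibits the affected construct (module, union, and member names are hypothetical):

```idl
module Example {
    // union whose discriminant is an unsigned short, and nothing else
    union Payload switch (unsigned short) {
        case 0:
            long count;
        case 1:        // on affected big-endian hosts this label was read as 0
            string text;
    };
};
```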
OSPL-12010 |
Tester, Tuner, and Jython scripting fail to write samples containing a sequence of enums. When writing a sample with any of the above tools whose type contains a sequence of an enumeration type, any readers would receive the sample with the sequence empty. This was due to an error in the writing-side serialization for sequence-of-enum types. Solution: Fixed the serializer in the underlying API to account for sequence-of-enum types. |
OSPL-11965 |
Possible crash in durability termination when handling an event. The durability service could crash during termination when handling an event in the "AdminEventDispatcher" thread, because the "conflictResolver" thread was stopped first while events could still depend on that thread's administration. Solution: The "AdminEventDispatcher" thread is now stopped before the "conflictResolver" thread. |
OSPL-11873 |
Possible application crash when using OpenSplice as a Windows service. When running OpenSplice as a Windows service under a different user than the application, an application crash could occur due to a permission problem. An indication that this is occurring is the following message in the error log: "Failed to allocate a wait handler on process handle, System Error Code: 87". Solution: The permission problem is fixed and the user application can now correctly communicate with the service. |
OSPL-11829 |
RnR service property file incompatibility. When using the RnR service on Windows, the property file that is created has Windows line endings (^M). Using this property file under Linux can cause problems. Solution: The problem is fixed and the property file incompatibility is gone. |
OSPL-11809 |
Possible crash in write after detach_all_domains. A multithreaded application could crash when one thread called detach_all_domains while another API call (for example a DataWriter_write) accessed the domain concurrently: the other call could access the domain while it was being detached, which caused the crash. Solution: It is no longer possible to access the domain while it is detaching. |
OSPL-11807 |
No SSL support for ddsi in Community Edition According to the documentation, it should be possible to configure ddsi to use SSL, even in the Community Edition. The ddsi code base in the Community Edition also contains everything needed to support SSL. The only reason SSL support was not available in the Community Edition was that the build files did not explicitly include this feature by passing the appropriate macro and including the required SSL libraries. Solution: The build files have been modified: if SSL support is available on the build platform, SSL support will be enabled by default in the Community Edition. |
OSPL-11778 |
Merge state incorrectly updated when the last partition.topic in a namespace did not change while others did OSPL-10113 addressed an issue where merging with an empty set could lead to a state update even when the data set had not changed. However, that fix only looked at the last aligned partition.topic in the namespace for changes, instead of at all partition.topics that are part of the namespace, and thus sometimes did not update the namespace state when it should have. Solution: All partition.topics in the namespace are now examined to determine whether the namespace state needs to be updated. |
OSPL-11690 |
A booting node with an aligner=FALSE policy may not reach the complete state in case it has not detected all namespaces from its aligner in time. If a node has a namespace with an aligner=FALSE policy, the node should only reach completeness when an aligner becomes available. Due to unlucky timing it is possible that the node detects an aligner but has not yet received all of its namespaces. In that case the node will not request the completeness state of the groups (i.e., partition/topic combinations) from its aligner while traversing its initial startup procedure. Because it does not request the groups from the aligner, the node will not know that the aligner has all its groups complete, and therefore never requests data from the aligner. This causes the node to never reach the complete state. Solution: Group information is now always requested from aligners if not already done. |
OSPL-8553 |
Shared memory Java applications not working on Windows 10 64-bit with Java 8 When using shared memory on Windows 10 64-bit with Java 8, it can happen that Java applications won't connect to the shared memory. Solution: The OpenSplice default shared memory address for 64-bit Windows has been changed to 0x140000000. The old default address, 0x100000000, was in the memory range used by the Java 8 JVM. |
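For reference, the shared memory mapping address can also be set explicitly in the configuration. A sketch assuming the //OpenSplice/Domain/Database/Address element; the domain name and database size shown are placeholders:

```xml
<OpenSplice>
  <Domain>
    <Name>shmExample</Name>
    <Database>
      <Size>10485760</Size>
      <!-- new 64-bit Windows default; the old default 0x100000000
           collided with the Java 8 JVM's memory range -->
      <Address>0x140000000</Address>
    </Database>
  </Domain>
</OpenSplice>
```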
OSPL-11913 |
Dynamic namespaces not learned when only received once The durability service is able to learn namespaces published by fellow durability services; these are called dynamic namespaces. Dynamic namespaces were only learned when the fellow was in the approved communication state, and this state was only reached when all namespaces had been received. Solution: When a fellow reaches the approved communication state, the previously received dynamic namespaces are now learned. |
OSPL-11540 |
When client durability and ddsi are used between OpenSplice nodes, alignment of historical data may fail if relevant readers are not discovered in time. One way to obtain historical data is to use client durability. When ddsi is used as the networking service, a precondition for a durability service to send historical data is that the client's historicalDataReader must have been discovered. By default, a durability client in OpenSplice sends requests with a timeout of 10 ms, so that a server will answer the request within 10 ms. However, if the server has not discovered the client's historicalDataReader within this time, the server sends an error message, and consequently the client will not receive the data. This situation typically appears when the server is unable to discover the client's reader in time, e.g., when network latencies are large. Solution: Whenever the server receives a historicalDataRequest from a client and the client's historicalData reader has not yet been discovered, the server will send the data once the reader is discovered, or send an error notification after 30 seconds have passed, whichever comes first. |
Report ID. | Description |
---|---|
OSPL-11708 |
Out-of-order delivery of an invalid message might corrupt instance state In V6.9.1, a bug was introduced where out-of-order delivery of an invalid message could corrupt the instance state, resulting in undefined behavior. Out-of-order delivery occurs when your Reader uses BY_SOURCE_TIMESTAMP ordering and a message with a source timestamp older than the currently stored message is delivered. Solution: The ordering algorithm has been modified so that this scenario no longer corrupts the instance state. |
OSPL-9054 | Leakage of JNI local refs when using the Java API When using the Java API and taking many samples at once, the number of JNI local refs could exceed the JNI local ref limit. Solution: The leak is fixed and will no longer occur. |
OSPL-11671 | Missing initial dynamic network configuration state When using the RT Networking service in combination with dynamic network configuration, it could happen that the initial publication of the dynamic network configuration state did not occur. Solution: The problem is fixed and the initial dynamic network configuration state is now always published. |
OSPL-11718 | The dcpsPublicationListener is not freed correctly when durability terminates When client durability is enabled, the durability service creates a reader that listens to DCPSPublication messages. When durability is terminated this reader was not cleaned up correctly, which could lead to a memory leak. This problem can only surface when client durability is enabled. Solution: The reader is cleaned up correctly. |
OSPL-11550 | Inefficient alignment in case of master conflicts for the same namespace with different fellows When nodes become temporarily disconnected, they may choose different masters to align from. When reconnected, this leads to a situation where there are multiple masters in the system. To resolve this situation a master conflict is scheduled that needs to be resolved. Resolving a master conflict leads to choosing a single master again, and in many cases leads to alignment between nodes. Previously, a master conflict was generated per namespace per fellow (i.e., a remote durability service). In case there are many fellows this leads to many conflicts, and hence many alignments. It is not necessary to generate a conflict per fellow to recover from such a situation. Solution: Whereas in the past two master conflicts for the same namespace with different fellows led to different conflicts that each needed to be resolved, they now lead to a duplicate conflict that is dropped. This decreases the number of master conflicts in case there are many fellows, and hence may decrease the number of alignment actions that will occur. |
OSPL-11531 | Similar conflicts can be resolved multiple times The durability service monitors the state of fellow durability services to detect any discrepancies between them. If there is a discrepancy, a conflict is generated to resolve it. Before a conflict is added to the queue of conflicts waiting to be resolved, a check is done to see whether a similar conflict is not already queued. If so, the newly generated conflict will not be added to the queue of pending conflicts. What was not checked is whether the conflict that is currently being resolved is similar to the one that is generated. Consequently, if a conflict (say conflict 1) is being resolved and another conflict (say conflict 2) is generated, and conflict 1 and conflict 2 are similar in the sense that they resolve the same discrepancy, then conflict 2 may still be added to the queue of pending conflicts. This may cause the same conflict to be resolved multiple times. Solution: Before a conflict is added to the queue, it is also checked whether the conflict currently being resolved is similar. |
OSPL-11606 | Sample validation by DataWriter may cause compilation issues when enabled By default the ISOCPP2 DataWriter does not validate sample input against, for example, the bounds specified in IDL. In order to enable bounds checking, the macro OSPL_BOUNDS_CHECKING should be set when compiling the idlpp output. However, when setting this macro some of the generated code might not compile. Solution: Bounds validation has been corrected, and is now also enabled by default. This is because the importance of bounds checking far outweighs the tiny bit of extra overhead introduced by it. |
Report ID. | Description |
---|---|
OSPL-11632 |
The durability service may not become complete after startup. When a durability service starts, it needs to find out whether it has become the master for one or more of its namespaces, and whether it needs to inject any persistent data that may be available. To determine all this, the durability service creates an initial conflict that needs to gather information such as namespaces and groups from fellow durability services in the system. Because fellows can enter the system at any time, there is a small window where groups are gathered from one set of fellows but namespaces are gathered from a different set. This can happen if the set of fellows changes between the gathering of namespaces and the gathering of groups. Even though this is a tiny window, it can happen. Because a durability service will only respond to a request for groups if namespaces have been exchanged, it is possible that a durability service keeps waiting for groups indefinitely, causing it not to reach completeness. This issue has only been observed when masterPriorities are configured. Solution: The set of fellows that participate in the resolution of the initial conflict is now a stable set. |
OSPL-11753 |
Dispose all data not processed by remote nodes When performing a dispose_all_data call on a topic, local instances are disposed and a DCPSCandMCommand sample is published on the network so that other nodes dispose instances of that topic as well. The sample is processed by the builtin subscriber of spliced, which in turn disposes the relevant instances. A bug in spliced, related to setting the event mask, caused it not to wake up when new data was available, so no instances were disposed on remote nodes. Solution: The bug was fixed by properly setting the data-available event mask. |
OSPL-11500 |
The mode to store data in human-readable format in the KV store was undocumented,
and trying to use it would lead to an error. The KV store is an efficient implementation of a persistent store that by default stores data as blobs that are difficult for humans to interpret. The KV store can be configured to store the data in human-readable format, but unfortunately this mode was not documented. Because only valid configurations can be used to start OpenSplice and this option was not documented, it was considered invalid. Therefore it was not possible for a user to configure the KV store in human-readable format. Solution: The option to configure the KV store has now been documented and no longer leads to an error. See //OpenSplice/DurabilityService/Persistent/KeyValueStore[@type] for more information. |
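A configuration sketch using the attribute mentioned above. The type value shown ("sqlite") and the surrounding elements are assumptions; consult the deployment guide for the exact supported values:

```xml
<OpenSplice>
  <DurabilityService name="durability">
    <Persistent>
      <StoreDirectory>/var/opensplice/pstore</StoreDirectory>
      <!-- the type attribute selects the key-value store backend/format -->
      <KeyValueStore type="sqlite"/>
    </Persistent>
  </DurabilityService>
</OpenSplice>
```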
OSPL-11398 |
idlpp generated code for C# has several issues with respect to handling IDL
collections and unions. The idlpp backend for C# generated code that had several issues when dealing with IDL collections and unions; various such constructs would fail to compile or fail to work.
|
OSPL-11533 |
C99: dds_instance_get_key seg faults with generic copy-out method When using the C99 dds_instance_get_key function on a generic data writer, a segmentation fault occurred in the generic copy-out function due to a NULL pointer. dds_instance_get_key calls DDS_DataWriter_get_key_value, which is where the segmentation fault in the copy-out routine originated. Solution: Added the missing copy code to DDS_DataWriter_get_key_value so that it now works when using generic copy routines. |
OSPL-11774 |
Added <Verbosity> tag to the native NetworkingService <Tracing> configuration. Previously the NetworkingService only supported tracing categories, where for each category a trace level 0..6 could be specified, while all other services provided a <Verbosity> tracing option for specifying a trace level 0..6. Solution: The new NetworkingService <Verbosity> option is an additional way to set trace levels besides the existing tracing category configuration options. The category options specify the trace level per category, whereas the new verbosity option sets the trace level for all categories. |
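A sketch of how the new option might appear in the service configuration; the placement inside <Tracing> and the attribute shown are assumptions taken from the description above:

```xml
<NetworkService name="networking">
  <Tracing enabled="true">
    <!-- new: one trace level (0..6) applied to all categories at once;
         per-category levels can still be configured alongside it -->
    <Verbosity>4</Verbosity>
  </Tracing>
</NetworkService>
```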
OSPL-11799 |
Dbmsconnect incompatibility with MS SQL 2017. When using the Dbmsconnect service replication functionality with MS SQL 2017 triggers do not function as expected. Solution: The fault in the replication functionality in dbmsconnect is now fixed. |
OSPL-11593 / 18657 |
Potential race condition between threads creating/deleting DomainParticipants. When an application has multiple threads that are creating and deleting participants into the same Domain in parallel, a race condition may occur between the thread deleting the last participant to that Domain (and thus implicitly deleting the Domain itself) and a thread that is creating a new participant to that Domain (and thus implicitly creating the Domain that is in the process of being deleted). Solution: The race condition has been removed by synchronizing threads that create and destroy DomainParticipants. |
OSPL-11707 / 18758 |
Representation of IDL constants not correct in many language bindings of idlpp. The representation of IDL constants in many of the language bindings of idlpp was incorrect. For example, negative numbers in C/C++ could result in compilation warnings, and in Java in compilation errors. Also, the representation of an enumerated constant would result in compilation errors in ISOCPP2, C# and Java. Solution: The representation of constants has been corrected in all the above languages. |
OSPL-11124 / 18494 |
ISOCPP2 Topic proxies do not correctly handle their reference to the DomainParticipant. When creating topic proxies of type dds::topic::AnyTopic or of type dds::topic::TopicDescription using the dds::topic::discover_all() function, releasing the participant before releasing the proxies themselves results in a NULL pointer dereference and a subsequent crash. Solution: All Topic proxies now correctly manage their reference to the participant so that they keep the participant alive for as long as the proxy itself is alive. |
OSPL-11289 / 18567 |
Idlpp generates ISOCPP2 code that may not compile for an array with elements that are a typedef to some other type. When using the ISOCPP2 backend of idlpp to compile an IDL model that has an array with elements that are a typedef to some other type, the generated C++ code will not always compile with a C++ compiler. Solution: The ISOCPP2 backend has been modified to generate C++ code that compiles correctly for these kinds of IDL constructions. |
OSPL-11639 / 18694 |
Invalid memory access when caching is disabled on client-durability policies. An issue with memory allocation in the client-durability thread of spliced can lead to a buffer overrun resulting in undefined behaviour. The issue can be triggered by enabling client-durability and configuring a policy with disabled historical data caching. Solution: The code responsible for the buffer allocation was fixed. |
OSPL-11647 / 18694 |
Data-available and data-on-readers events not triggered when mask includes other events. When a mask is set to include e.g. liveliness or other events in addition to data-available and/or data-on-readers events, the latter may not be triggered when a different event is queued and not yet processed by the relevant listener(s). This issue mostly occurs directly after creation of a DataReader for which an active writer exists. Creation triggers liveliness and subscription-matched events; when a sample is delivered before these events are processed, the data-available and data-on-readers events are not raised. Note that in the C99 API entities are created with an "ANY" mask by default, so the issue is much more likely to occur there. Solution: The issue was fixed by improving the code responsible for merging events, taking the different event types into account. |
OSPL-11703 |
Python binding: Segmentation violation when attempting to read/take topic data with zero-length sequences. For statically generated Python code (via idlpp -l python), a segmentation violation may result if topic data is read that contains a zero-length sequence. This could occur as the result of reading or taking a sample that corresponded to a disposed instance (in which only key fields are set), as well as with actual instances that have been written with an empty sequence. Solution: Zero-length sequences are now handled correctly on read/take. |
OSPL-11601 / 18659 |
Compile warnings in Corba-Java code generated by idlpp. When compiling classes generated by idlpp in Corba-Java mode, with a recent compiler, warnings occur because certain classes, inheriting from a Serializable base-class, lack the required serialVersionUID member. Note that it is auto-generated if missing, so does not cause any runtime issues. It does trigger compile warnings if compiled with -Xlint:all. Solution: The serialVersionUID was added to relevant idlpp templates so that generated code can be compiled warning-free. |
OSPL-11651 |
The ISOCPP2 API had one file that was published using the LGPL license. The implementation of the dds::core::array was taken from the std::array and was therefore referencing the LGPL license under which it was released. For OpenSplice users that want no dependency on LGPL this could have undesired legal consequences. Solution: The code that was published under LGPL has been removed, and is replaced by references to the std::array, the std::tr1::array or the boost::array, depending on the context of the compiler. |
OSPL-11711 |
Possible deadlock in multidomain waitset during object destruction. When using a waitset in a multidomain scenario it was possible that the internal multidomain runner thread caused the application thread to deadlock when trying to delete an object attached to a waitset as the internal thread could miss the deletion trigger. Solution: Made the deletion trigger persist so that it can no longer be missed. |
OSPL-11815 |
IDLPP generating incorrect Python on Windows platforms. The IDLPP Python language binding (-l python) produced incorrect Python code when run on Windows platforms: the generated code included numerous references to $s, which is invalid Python syntax. Solution: Python generation on Windows has been fixed. |
Report ID. | Description |
---|---|
OSPL-11581 |
OpenSplice doesn't communicate with OpenDDS DDSI rejected all incoming RTPS messages that have a different RTPS version than the OpenSplice DDSI implementation, whereas it should accept messages with equal or newer RTPS implementation versions. DDSI now accepts incoming RTPS messages that have an equal major version and an equal or newer minor version |
OSPL-11651 |
The ISOCPP2 API had one file that was published using the LGPL license The implementation of the dds::core::array was taken from the std::array and was therefore referencing the LGPL license under which it was released. For OpenSplice users that want no dependency on LGPL this could have undesired legal consequences. The code that was published under LGPL has been removed, and is replaced by references to the std::array, the std::tr1::array or the boost::array, depending on the context of the compiler. |
Report ID. | Description |
---|---|
OSPL-11629 |
Missing historical data when using alignee ON_REQUEST On wait_for_historical_data the reader generates a HISTORICAL_REQUEST event; when the durability service receives this event it sends out a historical data request. However, the durability service never received the event due to an error in the internal waitset implementation, which was not thread safe. The internal waitset has been made thread safe |
OSPL-11595 |
Wrong master selected for federations with the same master priority Master selection is based on 3 variables, in priority order: master priority, store quality and systemId. It was possible that, for federations with the same master priority and no store, the federation with the lowest systemId was selected as master, because the store quality was set to a random value instead of zero. The store quality is now set to zero when no store is used |
OSPL-11236 |
Instance purged before auto-purge delay has expired. When using an autopurge_nowriter_samples_delay or autopurge_disposed_samples_delay in a DataReader QoS, an instance could be purged before the delay expired. The time comparison that determines whether a delay has expired uses the monotonic clock implementation. This means an invalid timestamp can be returned if the "current" monotonic time is less than the autopurge delay. The monotonic time represents time since an unspecified starting point, often the time the system started, so the issue occurs when the system uptime is less than the autopurge delay. Solution: The issue was fixed by checking for an invalid timestamp before using its value to determine whether the instance should be purged. |
OSPL-11559 |
Missing release note in V6.9.0 release
In our V6.9.0 release we fixed an issue but forgot to add a release note. We have now added the release note for ticket OSPL-11502 to our V6.9.0 release notes |
OSPL-11503 |
Potential Deadlock releasing Python DDS entities The DDS Python API method that releases DDS entities (dds.Entity.close) would occasionally hang the main Python thread. This could only occur if a child entity registered a 'listener', and that listener gets triggered as part of closing the entity. Solution: The deadlock has been removed. |
OSPL-11268 |
IDLPP Python language binding ignored Constant declarations In release 6.9.0, the Python binding for IDLPP ignored Constant statements. Solution: In release 6.9.1, IDLPP now generates appropriate Python declarations. |
OSPL-11130 |
Python API: unable to create data readers and writers directly from a participant In release 6.9.0, the Python API did not support creating data readers and writers directly from a domain participant. Instead, the user had to create a publisher (data writers) or subscriber (data readers), first. Solution: The dds.DomainParticipant now has methods create_datareader and create_datawriter |
OSPL-11264 |
Python API: Simplified method to find and use existing topics In release 6.9.0, multiple steps were required to find an existing topic and set it up so that a Python program could read and write data on that topic. Solution: A new method, ddsutil.find_and_register_topic, has been created. It is a one-step solution to finding and locally registering an existing topic. |
OSPL-11269 |
Improved type checking in Python code generated by IDLPP In release 6.9.0, IDLPP for the Python language generated code that had little type checking. It was possible to set field values to inappropriate data types and/or value ranges. Solution: Type checking in generated Python code has been improved. Python exceptions are now thrown if inappropriate values are set. |
OSPL-11271 |
Full support for listeners in DCPS API for Python The 6.9.0 release of the DCPS API for Python included support for the on_data_available listener, only. Solution: All listener methods have been implemented. |
OSPL-11280, OSPL-11279 |
Python API documentation and example location The native Python API for DCPS had documentation and examples stored as a ZIP file within OSPL_HOME/tools/python. Solution: To improve accessibility of this documentation, API documentation and examples are now in subfolders of OSPL_HOME/tools/python. The Python API user manual is located with all other OpenSplice manuals. |
OSPL-11513 |
DDS Communication Status methods in Python API The Python API for DDS did not implement the DDS Communication Status methods described in the DDS specification. Solution: All DDS Communication Status methods have been implemented. See the Python API documentation for details. |
OSPL-11512 |
Initialization of dynamically generated Python topic data instances In the Python DDS API, dynamically generated topic classes initialized all fields to None. This was inconsistent with statically generated topic classes (created via IDLPP) and made such classes difficult to use. Solution: Fields in dynamically generated topic classes are now initialized to appropriate defaults: numeric fields to zero, strings to empty strings, sequences to empty lists, and arrays to appropriately sized arrays. |
OSPL-11521 |
Python API did not implement 'instance methods' The Python DDS API did not implement methods that accessed instance handles. Solution: Instance handle methods are now implemented. |
OSPL-11520 |
Python API did not support read_cond or take_cond The DDS Python API did not support data reader methods read_cond or take_cond. This made it difficult for users of the ReadCondition or QueryCondition classes to find appropriate data when such conditions were triggered. Solution: The read_cond and take_cond methods have been implemented. |
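A hedged sketch of the new methods. It assumes an existing dds.DataReader named reader; the ReadCondition constructor and the mask helper shown follow the Python DCPS API's general style and are assumptions, not verbatim from these notes:

```python
import dds

# 'reader' is assumed to be an existing dds.DataReader.
# The mask helper name below is illustrative.
cond = dds.ReadCondition(reader, dds.DDSMaskUtil.new_samples())
waitset = dds.WaitSet()
waitset.attach(cond)

# Block until the condition triggers, then take only the samples
# that match the triggered condition.
triggered = waitset.wait()
if cond in triggered:
    samples = reader.take_cond(cond)
```

The same pattern applies to QueryCondition: pass the query condition to read_cond or take_cond to retrieve exactly the data that caused it to trigger.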
OSPL-11522 |
Python API sources not included in distribution Source code for the Python DCPS API was not included in all installers. Solution: The source code for the Python DCPS API is now included in all installers for Vortex OpenSplice. In this form, it is possible to compile and install the API on any platform for any version of Python above the minimum supported. Please see the README.txt in |
Report ID. | Description |
---|---|
OSPL-11502 |
When a durability service generates a nameSpaceRequest on behalf of a fellow, the fellow's state
can be incorrect.
A durability service may decide to cache a nameSpaceRequest from a fellow if it has not yet discovered all relevant protocol readers and received the capability topic from the fellow. Only when all relevant protocol readers have been discovered and the capability topic has been received will the durability service answer the cached nameSpaceRequest. In answering a cached request, the durability service wrongly used its own state instead of the fellow's state. This may result in a wrong fellow state. Since the state of the fellow plays a role in alignment and master election, this may impact alignment and potentially cause alignment to fail. Solution: Instead of using the durability service's own state, it now uses the state of the fellow on whose behalf the cached nameSpaceRequest is sent. |
OSPL-11245 |
License checks are inconsistent
Licensing of components was not consistently checked, which may result in improperly counted licenses. Solution: License checking has been improved. NOTE: Please check your license before applying this update, since the enhanced license checking may cause your application not to work. |
OSPL-11160 |
Qos mismatch between OpenSplice and Lite using the C99 API
A Writer does not match a Reader (and vice versa) when the Topic QoS is configured as "RELIABLE" and "TRANSIENT_LOCAL". This causes late joiners not to receive any samples, although the Topic Durability QoS was set to "TRANSIENT_LOCAL" and the Topic Reliability QoS was set to "RELIABLE" on both sides. Solution: The QoS mismatch is fixed; OpenSplice and Lite can now communicate correctly using the C99 API. |
OSPL-11199 |
Idlpp spins indefinitely when compiling an IDL data struct with indirect recursion
When you try to compile an IDL data model that has indirect recursion (i.e. a datatype has a reference to another datatype that eventually refers back to the original datatype), the IDL compiler starts to spin indefinitely, trying to walk through the recursive cycle over and over again. Solution: The algorithm used to handle recursive datatypes has now been modified to also support indirect recursion in a correct way. |
OSPL-11157 |
Installer asks for License file even if user declines to supply one
Installation process asks for License file even when user answers N to providing license file. Solution: The installation process will no longer ask for an existing license file if the user declines to supply one. |
OSPL-11028 |
Python language support for IDLPP
The Python language binding shipped in Vortex OpenSplice 6.8.3 did not support compilation of IDL into Python classes. Instead, the binding provided a method for dynamically creating Python classes, given an IDL file. While dynamic generation of Python classes is functionally equivalent to having IDLPP create Python code, source-code aware editors can provide better content assistance while editing if they have access to source code. Solution: IDLPP now supports a python language binding: idlpp -l python [other-idlpp-options] idl-file |
OSPL-11026 |
Using topics defined in other DDS applications
In Vortex OpenSplice 6.8.3, the Python binding for DDS did not allow a Python application to access a topic without having access to the IDL defining that topic. Solution: The Python binding now supports a mechanism for registering a topic found via DomainParticipant.find_topic as a local topic. A local topic is a topic for which locally defined Python classes exist. The process for creating a local topic from a found topic is illustrated in the following example:
dp = dds.DomainParticipant()
found_topic = dp.find_topic('OsplTestTopic') # defined by Tester
local_topic = ddsutil.register_found_topic_as_local(found_topic)
gen_info = ddsutil.get_dds_classes_for_found_topic(found_topic)
OsplTestTopic = gen_info.get_class(found_topic.type_name)
# proceed to create publishers, subscribers, readers & writers by referencing local_topic |
OSPL-11135 |
Inconsistent treatment of character data in Python binding
In the Vortex OpenSplice 6.8.3 beta of the Python binding, Python properties corresponding to IDL fields of type 'char' and 'string' were treated inconsistently. Sometimes they would accept, expect or return a Python bytes value; at other times, a Python str (string) would be used. Solution: Treatment of IDL string and char values has been standardized as mapping to Python str values. You should always use a Python str when writing such properties, and always expect a str to be returned by such properties. For arrays or sequences of IDL char, the Python equivalent is a list of str, where each Python string is exactly one (1) character long. |
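The resulting convention can be summarized in plain Python terms (a sketch; the variable names are illustrative and not taken from a real generated class):

```python
# IDL string -> Python str
name = "sensor-1"

# IDL char -> Python str of exactly one character
grade = 'A'

# IDL char[3] or sequence<char> -> list of 1-character str values
code = ['x', 'y', 'z']

assert isinstance(name, str)
assert isinstance(grade, str) and len(grade) == 1
assert all(isinstance(c, str) and len(c) == 1 for c in code)
```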
OSPL-11238 |
QoS parameter mandatory on some Python APIs
The beta version of the Python DCPS API (Vortex OpenSplice 6.8.3) included several methods where quality of service (QoS) parameters were mandatory. This was inconsistent with other methods, where you could rely on appropriate defaults being generated. Solution: All QoS parameters in the Python DCPS API have been made optional, and, if not provided, then an appropriate default is used. |
OSPL-11248 |
Python API has no way to explicitly delete entities
The beta version of the Python DCPS API (Vortex OpenSplice 6.8.3) did not provide a mechanism for explicitly deleting entities. Instead, all entities were released at the end of the Python session. Solution: A close method has been added to all entity classes (DomainParticipant, Publisher, Subscriber, Topic, DataReader and DataWriter) that explicitly releases the entity and all its child entities. |
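A sketch of explicit teardown with the new close methods. It assumes the Vortex OpenSplice Python API; entity creation details are elided and the create_publisher/create_subscriber calls are assumed from the API's documented entity hierarchy:

```python
import dds

dp = dds.DomainParticipant()
pub = dp.create_publisher()
sub = dp.create_subscriber()

# ... create topics, readers and writers; exchange data ...

# New in this release: release entities deterministically instead of
# waiting for the end of the Python session. Closing an entity also
# closes its child entities.
pub.close()
sub.close()
dp.close()
```

Closing the participant last mirrors the DDS entity hierarchy: it releases any remaining child entities in one call.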
OSPL-11248 |
Python support for DCPS built-in topics
The beta version of the Python DCPS API (Vortex OpenSplice 6.8.3) did not include support for built-in DCPS topics. Solution: Vortex OpenSplice 6.9.0 release includes support for the built-in topics. Because the DCPS topics are pre-registered in every domain, you may find that using 'over-the-wire' topic discovery to use these topics. See the documentation ddsutil.find_topic, ddstuil.register_found_topic_as_local and ddsutil.get_dds_classes_for_found_topic as well as the Python DPCS API Guide. |
OSPL-11496 |
Durability crash during termination
When durability is terminating, it could crash if a heartbeat is received from another node during termination. Solution: The crash during termination is fixed by changing the termination order of the durability threads. |
Report ID. | Description |
---|---|
OSPL-11388 |
Invalid or empty description in log reports produced by XML report-plugins. A bug in building the XML document that is forwarded to report plugins caused the description element of the report to contain invalid XML syntax. Depending on the report-plugin implementation, this could cause, for example, a crash during XML parsing or an empty description element in the output. Solution: The bug was resolved so that proper XML data is provided to report plugins. |
OSPL-11241 |
Unable to start durability service on macOS. The durability service could not be started on macOS because libleveldb-ospl.dylib had a build path set as its library location. Solution: The build path has been replaced with @rpath. |
OSPL-11240 |
Java tools which require a license not working on macOS. When starting the Tuner or Tester on macOS, an rlm licensing error is shown and the tools can't be used. Solution: The problem is fixed by changing the ospltun and ospltest scripts to include the rlm library. |
OSPL-11287 |
Race condition during fellow discovery that could prevent communication with
a fellow. When a durability service discovers another durability service (called a fellow), it first has to confirm the fellow before it is allowed to talk to it. In the unfortunate event that a status message was first received from the fellow (causing the fellow to be confirmed) and remote readers from the fellow were discovered later on, the fellow became unconfirmed again. If the fellow sent a request during the short period in which it was confirmed, the durability service might never respond to that request, because the fellow had become unconfirmed again by the time the request was received. Solution: An unconfirmed fellow can become confirmed, but never the other way around. |
OSPL-11232 |
When a durability client requests historical data while the durability server is
still initializing, the alignment may fail. When an application that uses client durability is started while the durability server is still initializing, the durability client may not align correctly. For example, when the durability client sends a historical data request for a group that is not yet known at the durability server (because the server is still learning the locally available groups), the server responds with a response containing no data and an error code indicating that it is not responsible. The durability client ignored the error code contained in the response and incorrectly marked the group as complete. Solution: When a durability client issues a historical data request and the durability server answers with a response indicating an error, the durability client now ignores the response and issues a new historical data request when it receives a state change indication from the server. |
OSPL-11158 |
The built-in topics are not aligned when client durability is used in combination with
RT networking. The client durability implementation does not align a number of built-in topics. This is not necessary in combination with the ddsi service, because the ddsi service provides these built-in topics. However, when RT networking is used, these built-in topics are not available. This causes, for example, the publication-matched and subscription-matched statuses to not function properly. Solution: When the RT networking service is configured, client durability will also issue historical data requests for the alignment of the built-in topics. |
OSPL-11150 |
A full network queue may cause the durability service to not update its lease in
time. The durability service periodically writes a status message to inform the other durability services about its state. When the network queue is full, the write of the status message may time out, and the durability service will retry the write. This continues until the write of the status message succeeds. However, the thread that periodically writes the status message is also responsible for updating the lease that informs the spliced daemon that the durability service is still alive. When it takes too much time to successfully write the status message, the lease may not be updated in time, causing the spliced daemon to declare the durability service dead and perform the configured failure action. Solution: The function that writes the status message now returns an error when the write times out, instead of constantly retrying. The error return causes a rewrite of the status message to be rescheduled directly after checking whether the lease should be updated. |
OSPL-11193 / 11194 |
Listeners called before DataReader creation is completed, resulting in 'not-enabled'
exceptions when accessing the DataReader. When a listener is supplied as a parameter when creating a DataReader (as opposed to setting a listener after creation of an entity), there is a possibility that a listener callback is executed by the listener thread before the DataReader creation has finished in a different thread. When the callback accesses the DataReader, e.g. to read/take data, a 'not-enabled' status will be returned (or an exception raised). Solution: During creation, the DataReader is enabled before events are triggered. This ensures that listener callbacks are only executed after the DataReader is enabled. |
OSPL-10931 |
A late joiner with masterPriority="0" can wrongly align from an already existing node
with masterPriority="0" if no master is available. The configuration setting //OpenSplice/DurabilityService/NameSpaces/Policy[@masterPriority] specifies the "willingness" to become a master for other nodes. The special value 0 is reserved for nodes that are not willing to act as aligner for other nodes. If there exists a node (say A) with masterPriority 0 and a late joining node (say B) with masterPriority > 0 joins, then the late joining node B will successfully take mastership and node A will slave to it. But if B also has masterPriority=0 configured, then no alignment should take place. However, due to a bug in the algorithm to determine a suitable aligner, two nodes with masterPriority=0 could still align from each other. Solution: The bug has been fixed so that two nodes with masterPriority=0 do not align from each other. |
OSPL-10651 |
Invalid network interface name causes crash. When configuring a network interface name for RT networking that doesn't exist (and no alternative is available), networking crashes. Due to incorrect annotation of a function, the compiler removed a result check, assuming the interface lookup would always succeed; this caused a crash when the lookup failed. Solution: The annotation has been corrected so that a failed interface lookup is detected instead of causing a crash. |
Report ID. | Description |
---|---|
OSPL-9967 |
Possible crash when loading XML storage files in Durability service
There's a possibility that the Durability service crashes at startup. The root cause of this issue is that the buffer where the file name of the XML storage file is stored is of limited size. Solution: The defect is fixed; the buffer is now dynamically allocated. |
OSPL-11113 |
Possible SignalHandler crash during termination of an application
The SignalHandler within an application listens to signals posted to the application (like Ctrl-C). When an application shuts down, the SignalHandler also shuts down, and the application should therefore not accept any new signals. If a new signal was posted to the application after the SignalHandler had terminated, the application could crash. Solution: When the SignalHandler of an application terminates, no new signals are accepted. |
OSPL-11094 |
Possibly missing persistent samples in persistent store.
If a persistent writer was created during startup of the durability service, it was possible that data written by that writer was never stored in the persistent store. This happened when the writer created a group and the group existed before the persistent data reader, the groupQueue, was created. The groupQueue only attached to groups via the NEW_GROUP event, which could be missed for already existing groups. Solution: The persistent data reader, the groupQueue, now also attaches to already existing groups. |
OSPL-11101 |
The legacy master selection by durability may not converge when the master selection is based on majority.
When the durability service is configured to use the legacy master selection algorithm, then depending on the situation the algorithm may resort to selecting the master based on the master that is proposed by the majority of known fellows. Under certain circumstances (for example, fellows becoming temporarily disconnected) this may cause some fellows to select different masters. Solution: When a durability service detects that another durability service has selected a different master, a master conflict is raised, which triggers a reevaluation of the selected master. |
OSPL-11114 |
Possible memory leak in domain participant destructor in IsoCpp2.
Sometimes, during closing of a domain participant, an "ALREADY DELETED" exception was generated. As this exception wasn't handled, further clean-up of the domain participant wasn't performed, which caused a memory leak. Solution: The exception is now handled and a proper clean-up is performed. |
OSPL-10769 |
Threads that cause an error when accessing shared memory detach the shared memory before generating a core.
When a thread that is accessing shared memory causes an exception to be raised, then the signal handler will detach from the shared memory prior to dumping the core. That makes the core pretty useless in many cases. Solution: The signal handler no longer detaches from shared memory when the thread that is raising the exception is accessing the shared memory at that moment in time. In all other cases the signal handler will still detach the shared memory in an attempt to leave the shared memory in a consistent state for other processes using the same federation. |
OSPL-10935 |
Add X-Types built-in Topic types to IsoCpp2.
The X-Types specification provides a few simple built-in Topic types: BytesTopicType, StringTopicType, KeyedBytesTopicType and KeyedStringTopicType in the dds::core namespace. These types should be available within the IsoCpp2 API. Solution: The X-Types built-in Topic types have been added to IsoCpp2. |
OSPL-11067 |
No documentation for the create_persistent_snapshot() exists for IsoCpp2.
A proprietary API function exists to create a snapshot of the persistent store. The IsoCpp2 API implements this function, but no documentation for this feature was provided. Solution: Documentation for the create_persistent_snapshot() API function call is now provided for IsoCpp2. |
OSPL-10715 |
delete_contained_entities fails when the reader listener has an outstanding loan.
When using the Classic C++ API with a DataReader that has a listener for DATA_AVAILABLE, calling delete_contained_entities during the listener callback could fail with PRECONDITION_NOT_MET if the callback does a read with a loan. The deletion of the reader removed the listener interest but did not wait until the callbacks were finished. Solution: The deletion of the reader now waits until the listener callbacks are finished. |
OSPL-10717 |
Unable to create readers for a topic starting with '_'.
Unable to create readers for a topic starting with '_', which is allowed by the DDS specification. Solution: '_' is now an allowed first character in topic names. |
OSPL-10810 |
Using nested idl files on windows for Classic C++ could cause build errors.
The idlpp/cppgen generated header files for Classic C++ contained an #undef of the import/export macro for Classic C++. When using nested IDL files on Windows, this could result in the imports/exports not being set correctly, which could cause a build error. Solution: The #undef has been removed from the generated header file. |
OSPL-10301 |
Error running vcredist when installing on Windows.
Any error running vcredist on Windows at installation time resulted in a warning prompt, and the final steps of the installation process were aborted. This occurred even if the error was because the vcredist package was already installed. The cause of the error cannot be determined at installation time. Solution: No warning is displayed and the installation process is not aborted if there is an error running vcredist. |
OSPL-10746 |
In a single process deployment mode a deadlock may occur when an application installs a signal handler to trigger a guard condition.
The deadlock occurs when a signal interrupts a thread that is currently accessing the guard condition and invokes the signal handler; in that case, the lock of the guard condition is already taken and the trigger operation will deadlock when it tries to take the lock again. This problem only exists in single process deployment mode, because in shared memory deployments signal handling is always outsourced to a dedicated signal handling thread. Solution: Usage of the dedicated signal handler thread is now enabled in single process deployment mode as well. |
OSPL-10771 / 18286 |
Group transactions may end up consuming a lot of memory.
When using group coherency features with many writers in a group and many non-overlapping transactions, the amount of memory needed to store redundant End Of Transaction (EOT) messages may become quite large. Solution: The redundancy between duplicate EOT messages is removed on the receiving side. |
OSPL-10860 |
When using Java based tools license checking didn't check all locations.
When using a Java based tool and prismtech_LICENSE was set to an invalid/expired license, the secondary license location (OSPL_HOME/etc) was never checked for a valid license; only the tertiary location (VORTEX_DIR/license) was checked. This could lead to the tool not starting, with a license error, while a valid license was present in the secondary location. Solution: All license locations are now checked. |
OSPL-10929 / 18400 |
Customlib IsoCpp2 Debug platform configuration crashes when using with an OpenSplice release build.
When the IsoCpp2 customlib is compiled for a Debug platform configuration and that build is used with a release OpenSplice version, it can crash due to a missing preprocessor define. Solution: The issue is fixed and the customlib for the Debug platform configuration is now generated properly. |
OSPL-10702 / 18253 |
Sequence of typedef'ed char results in incorrect data received when using ISOCPP2
The copy-out route for sequences incorrectly added data to the end of the newly created sequence instead of the beginning. As the sequence was first initialized with zeros, the end marker was placed after the initialized data, creating a sequence twice as long as needed with its first elements set to zero. Solution: Instead of appending data to the sequence, assignment is now used, which places new data at the beginning of the sequence. |
OSPL-8465 |
Google Protocol Buffers: embedding inner messages with keys multiple times fails
If you have keys in an embedded message and use that message more than once in your top-level message, only the first occurrence becomes part of the key. Solution: It is now possible to embed inner messages that contain keys multiple times, and they will all become part of the key. |
OSPL-10712 |
Durability native state conflict after initial alignment could trigger second merge
When the master federation had a durability namespace with a merge state that was not equal to the default, a fellow durability service with a queued native state conflict resolved the conflict instead of discarding it after the initial merge had completed. This resulted in the same data being aligned twice. Solution: After the initial merge completes, the namespace merge state on the fellow durability service is updated to match its master. |
OSPL-10924 / 18398 |
CPP generated code results in an error message when using Klocwork.
When compiling idlpp-generated code for the CPP language with Klocwork, the error "there is a copy constructor but no assignment operator" can occur. Solution: The problem is fixed and the correct assignment operator is now used. |
OSPL-9572 |
Launcher failed to run in Windows XP 32 bit operating system environment.
Launcher would not run on Windows XP 32-bit systems. The Java packager generated an .exe file that only supported OS versions Windows 7 and up. Solution: The .exe file is modified after generation to support Windows XP. Launcher is now supported on Windows XP 32-bit operating system environments. |
OSPL-9521 |
Launcher: Java Tools buttons disabled the first time launcher is opened after installation.
The code checking JAVA_HOME used a 1-second timeout. On the first call, it would time out and the check would fail. Solution: The JAVA_HOME check is now done on a separate thread, which notifies the listeners (UI panes) when the check is complete. |
OSPL-10502 |
Launcher: In the Tools tab, provide an RnR Manager button to open the RnR Manager product.
Launcher provides a Tools tab that allows users to open OpenSplice tools. If RnR Manager is installed, display a tools button to open RnR Manager. Solution: Search for an RnR Manager installation in the expected default directory derived using OSPL_HOME. Check the version numbers, and select the highest version number. When an RnR Manager installation is found, in Launcher's Tools tab, display an RnR Manager button to open the RnR Manager product. |
OSPL-8069 |
Launcher: Make the file and directory choosers smarter in the Settings dialog / environment tab.
The file and directory choosers had no logic for the initial directory when choosing a file or directory. Solution: The file and directory choosers use the current value to set the choosers' initial directory. Smarter initial directory defaults are also provided if applicable. |
OSPL-10523 |
Tuner: Actions should automatically refresh tree views.
After user-initiated actions, the user also had to initiate a "Refresh entity tree" action. The refresh should happen automatically where possible. Solution: For the "Import data" and "Import metadata" actions, the entity trees are now refreshed automatically. |
OSPL-10513 |
Tuner: Should provide an option to filter out internal topics.
Tuner always displays all topics, including specification topics (DCPS), and OpenSplice product internal topics (CM, d_, q_, rr_). Solution: Added a Topic Filters tab in the Preferences dialog. This tab allows users to filter which topics are displayed in the tree views. By default all DCPS topics are shown, and all OpenSplice internal topics are hidden (CM, d_, q_, rr_). |
OSPL-10667 |
Tuner: Default partition empty string displayed as "null" in QoS tab.
The cmapi internally stores the default partition policy value as a literal Java null. So in Tuner, when it saw the partition value null, it would use the string "null" as the value in the QoS view table. Solution: In the QoS view table, the null is now displayed as an empty string. |
OSPL-10345 / 17884 / 17883 / 17882 / 17880 |
Added filtering for server ID to Dynamic network configuration for RT Network service.
To effectively utilize dynamic network configuration updating, filtering on server ID needs to be added. Always updating all OpenSplice instances on a network with a new network configuration does not provide practical functionality. Solution: The new filtering has been added to the RT Network service. |
OSPL-10371 / 17881 |
New network partition does not result in updated partition mapping for RT Network service.
When dynamically updating an RT Networking network configuration, adding a new network partition did not result in proper updating of the partition mappings. The partition mappings need to be updated prior to notifying the channels of the new network partition and mappings. The symptom of this problem is that the new network partition is added, but a topic message is sent over the global network partition rather than the new network partition. Solution: The defect is now fixed and a new network partition now results in an updated partition mapping. |
OSPL-10396 |
Idlpp crashes when compiling a sequence of a typedef to a sequence to C or C99.
The IDL compiler idlpp crashes when it needs to generate the C or C99 representation for a sequence of a typedef to a sequence, or it generates code that doesn't work correctly. Solution: The idlpp code has been modified to generate the correct C/C99 representation and to no longer crash during this generation process. |
OSPL-10576 |
Leakage measurements on single process applications show a lot of heap leakage.
Most of the heap leakage found is allocated by the internal database. In single process deployment the database allocates data on the heap instead of in a separate shared memory storage as in the shared memory deployment, meaning that destruction of the database itself doesn't automatically free all remaining allocated data. This leakage is in fact not a resource problem, but it makes it hard to identify real issues. Solution: The database now frees all remaining data objects on destruction. |
OSPL-10659 |
When a node is rebooted, persistent data may be inserted as unregistered even if there is no evidence that the original publisher has left.
When a node that has persistency configured is rebooted and the data in its persistent set is the same as the set that would otherwise be provided by the aligner, it is more efficient to load the data from disk than to request it from the aligner. Previously, the persistent store always assumed that data published from the persistent store was unregistered. Since the unregistered state is taken into account when calculating a hash over the set, hashes are likely to differ, causing alignment. To prevent this situation, the injected data should be marked as implicit, which also better matches the semantics that the injecting node deduces that there is no writer. Solution: When messages are injected from the persistent store, they are now injected as implicitly unregistered instead of explicitly unregistered. |
OSPL-10688 |
Simulink: Reader block may miss samples when configured to wait for available data.
A Reader block configured to wait for available data always creates a wait set in every step, and only reads samples if the wait set is triggered. This implementation misunderstands the nature of the triggering of the 'data available' status condition. Data available is only triggered when DDS places new data into the reader's local buffers. It is possible for a Reader block to wait unsuccessfully for available data, but still have unread data locally available. Solution: The Reader block has changed to do the following: 1) attempt to read or take (as configured); 2) if no samples are returned, wait for available data (if so configured); 3) if available data is triggered in step 2, then read or take samples (as configured). |
OSPL-10689 |
Simulink: Writer block does not write samples in step in which it waits for a publication match.
A Writer block configured to wait for a publication match will not write samples in the same step in which it waits for the match. This may result in an application failing to write some samples. Solution: The Writer block was changed to write after waiting for a publication match, so long as a match was found or the block was configured to write even if a match was not found. |
OSPL-10699 / 18251 |
Network service failed to start due to missing system clock.
There is a mismatch between the platform configuration specified by the tool chain and what the platform actually supports. This causes clock_gettime to return an error code. As there is no error checking on this call, the failure is not detected and the clock doesn't work. Solution: If the call fails when using BOOT_CLOCK, the call is retried using the MONOTONIC clock. |
OSPL-10756 |
Simulink: Compilation errors in Simulink Coder concurrent execution models when using Reader block.
Simulink allows you to configure a model for 'Concurrent Execution'. The basic requirement for this is that each thread of execution is represented in a separate Simulink model, and then these models are referenced from a top-level Simulink model using the 'Model' block. The top-level model is then configured for concurrent execution. The details of doing this are described in the following MathWorks article: Configure Your Model for Concurrent Execution. When such concurrent models are built into executables with Simulink Coder, compilation errors result if the DDS Reader block was used in any of the referenced models. The error indicates that the include file "dds.h" cannot be found, and occurs as Coder attempts to compile the C code for the top-level model. Solution: The generated code for the Reader block has been changed so that "dds.h" is no longer required when compiling the top-level model. |
OSPL-10865 |
Durability services fail to align in Device-to-Device deployment with Cloud.
Durability doesn't consider fellows behind a Cloud service responsive because it fails to match a fellow's readers with that fellow's id. The readers and writers are properly discovered and data will flow. Solution: The issue is now fixed and the durability service will align when using the cloud service. |
OSPL-10954 / 18403 |
On the classic C++ API the registration of an already registered typesupport leaks memory.
When a typesupport is registered, the existing typesupport collection is searched to find out whether the typesupport is already registered. When an already registered typesupport is found, its reference count is increased but not released after being used. This causes a memory leak. Solution: The found typesupport is released after being used. |
OSPL-10967 |
Issues with sequence/array of enums and multi-dimensional arrays in C#.
The IDL compiler idlpp generates incorrect C# marshaling functions for the following IDL constructions: arrays or sequences of enums, and multi-dimensional arrays in general where the number of dimensions is greater than 2. The issues cause either idlpp to crash when generating the code, the marshaler to crash when executing the code, or the wrong data to be written into or read out of the system. Solution: The idlpp code used to generate the C# marshalers has been corrected and no longer suffers from these issues. |
OSPL-11083 |
C# compilation failure in PingPong example prior to .NET 4.0.
The PingPong C# example calls Stopwatch.Restart in pinger.cs. The Stopwatch.Restart method was not added until .NET 4.0. Solution: The problem is fixed and the Restart call is replaced with a call that works in .NET 3.5. |
OSPL-11020 |
Identical sample requests from the same alignee are combined by an aligner durability service.
A durability service can combine sample requests to optimize alignment efficiency. When two sample requests for the same data originate from different alignees the sample requests should be combined. However, when two identical sample requests originate from the same alignee it makes no sense to combine them. Combining them does not lead to an inconsistent state, but it may cause other nodes to receive one. Solution: Before actually combining the request, the aligning durability service checks whether it already has a pending request that addresses the requesting alignee. If so, the request is not combined. |
OSPL-10871 |
The unregister resulting from the delete of writer may be ignored on systems with low time resolution.
When a writer is deleted directly after writing an instance for the first time and the clock resolution of the system is very poor, it may occur that the unregister message resulting from the deletion of the writer gets the same timestamp as the sample while the sequence number of the unregister message is set to 0. This causes the unregister message to be considered older and to be ignored. Solution: The sequence number of the unregister message is set to the maximum value, which causes the unregister message to be processed correctly. |
OSPL-11087 |
Issues with unions having a sequence of char or sequence of boolean in C#.
When using idlpp to compile a union that has a branch with a sequence of char or a sequence of boolean, the C# backend either crashes, or generates code that does not compile properly. Solution: The C# backend for idlpp now generates the correct code for handling these cases and no longer crashes on them. |
OSPL-11131 |
Idlpp backend for C# may generate incorrect marshaling code for sequences
The C# Marshalers generated by idlpp do not always contain the correct way to obtain the type description for attributes of type sequence: the function call that looks up the type description uses the C# attribute name rather than the IDL attribute name that is used to index the types. In most cases both names are equal, and so there is no impact, but when the names differ, the Marshaler may crash with a System.NullReferenceException when writing samples. The IDL attribute name and its C# counterpart may differ when the IDL name is based on a C# keyword (in which case its C# representation is prefixed with an underscore) or when the idlpp option "-o custom-psm" is used, in which case C# attribute names may be modified using the PascalCase notation. Solution: Idlpp now always uses the IDL attribute name when looking up a type descriptor. |
OSPL-10728 |
The use of inline arrays of pointers to structured types that have an alignment not equal to 4 bytes on a 32 bits platform may cause data corruption or crashes.
The problem is caused by an incorrect alignment calculation for arrays of pointers. The array alignment should be equal to the alignment of the pointer (4 bytes on a 32 bits platform) but is actually equal to the alignment of the pointed type. As a result the memory offset of the array can be incorrect causing data corruption or crashes. Solution: Fixed the alignment calculation. |
OSPL-10795 |
Build error when building from source on Windows with Visual Studio 2015 update 3.
The IsoCpp2 API could not be built on Windows with Visual Studio 2015 update 3 because of a dependency on the _HAS_CPP0X macro, which is not set by Visual Studio 2015 update 3. This caused C++11 to be disabled for some parts. Solution: Removed the _HAS_CPP0X dependency for Visual Studio 2015 and later when enabling C++11 support. |
OSPL-10818 / 18358 |
Typedef of a Sequence of a struct generation for C99 is faulty.
When using C99 and an IDL file that contains a typedef of a sequence of a struct, the code generated by idlpp omits the definition of the struct. Solution: The problem is fixed and a correct definition is now generated by idlpp. |
OSPL-10958 |
Writing samples that contain a sequence of a primitive type in C# could lead to an application crash.
When using C# and having a structure in IDL that contains a member whose type is a sequence of a primitive type, writing samples of that structure could lead to an application crash. Solution: The problem is fixed and correct C# code is now generated by idlpp. |
TSTTOOL-485 |
Tester created readers unable to receive historical data samples from transient durability topics.
An error in the way Tester created data readers resulted in historical transient data never being read and displayed in the sample list. Solution: The error is fixed and historical data is visible in the sample list on reader creation once again. |
Report ID. | Description |
---|---|
OSPL-10489 |
Reader port removed from Vortex DDS Reader block for Simulink
The optional 'reader' port on the Vortex DDS Reader block for Simulink served no purpose. Solution: The port has been removed. Existing models that include a reader port will have the port removed. Any connectors connected to the reader port must be removed. |
TSTTOOL-474 |
Python Scripting: Exception on exit
In the Python scripting environment (osplscript), creating a 'Recorder' (osplscript.recorder.Recorder) that is never used will result in an error on exit indicating a java.lang.NullPointerException. Solution: Guard code has been introduced to ensure that clean-up code is called only when needed. |
TSTTOOL-470 |
Tester cannot write enum values whose labels start with its type name.
In some cases it was possible for the Tester to fail to write certain samples of certain types. The triggering type of data is an enumeration type whose label names start with the name of the enum type. When encountered, that type name string is cut out of the outgoing data value for the enum, which results in a failure because the middleware no longer recognizes the enum value it is trying to write. Solution: Tester no longer strips out the type name from the enum label. |
OSPL-10499 |
Launcher Configurations tab User Interface Improvements
In the Launcher "Configurations" tab, the purpose of a configuration may not be clear to a user. Also, some of the functionality in the tab may not be intuitive to a user. Solution: Changes were made to the "Configurations" tab to address usability. These include: adding descriptive text, adding a link to the deployment guide, highlighting the ACTIVE configuration, disabling buttons when they are not applicable, a new right click menu for the configuration selection, and changing double click functionality to open a configuration instead of setting ACTIVE. |
OSPL-10497 |
Launcher Tools and Controls tabs should explain why certain options are greyed out and disabled.
In the Launcher "Tools" and "Controls" tabs, access to tools and controls is done using buttons. In certain conditions, these buttons are disabled and shown as greyed out. It may not be clear why they are disabled. Solution: Tooltips were added on mouse hover to give users an explanation as to why the buttons are disabled. |
OSPL-10496 |
Launcher Tools and Controls tabs - provide a description of the tools and controls.
In the Launcher "Tools" and "Controls" tabs, access to tools and controls is done using buttons. It may not be clear to the users what these tools and controls are for. Solution: Tooltips were added on mouse hover to give users a description of button functionality. |
OSPL-10487 |
Error Dialog from DDS Topic block for Simulink
In MATLAB/Simulink R2017a, Simulink models using the Topic block from the Vortex DDS Block Set could see an error dialog with the following text "Error evaluating 'MaskDialog' callback of SubSystem block (mask)". The error arises from API changes between MATLAB R2016b and R2017a. Solution: Solved the problem by correctly calling the revised APIs. |
OSPL-10486 |
Vortex DDS block labels displaying incorrectly in MATLAB R2017a
In MATLAB/Simulink R2017a, in Simulink models that use the Vortex DDS Block Set blocks and hide one or more of the optional block ports, question marks (???) were overlaid on the block icon. Note that this behaviour did not impact execution of blocks. Solution: With OpenSplice 6.8.2, all DDS blocks now display their block labels correctly. |
OSPL-8573 |
Tuner - Various errors and exceptions thrown through regular UI actions.
Certain UI actions in the Tuner tool consistently cause exceptions to be printed to the console, and the expected UI action to fail. Particularly the Export Data menu items in the main window and some child windows. Solution: The causes of the exceptions being thrown have been fixed, and the corresponding UI menu actions work properly. |
OSPL-4705 |
Launcher preferences need to be stored in a separate location for each OpenSplice
installation.
Launcher stored all preferences in a .olauncherprefs file in the user home directory. If a user has multiple installations of OpenSplice, Launcher will reuse that file for all installations. This caused some confusion. (ex. licensing, resetting OSPL_HOME) Solution: The OSPL version was appended to the .olauncherprefs file name, so that each installation will have its own launcher preference file. For example: .olauncherprefs6.8.2. If the OSPL_HOME variable is not set, the preferences file name will default to .olauncherprefs. |
OSPL-10595 |
Error when compiling ISOCPP2 with Visual Studio 2015 Update 3 or higher
When trying to compile the ISOCPP2 library, or any ISOCPP2 based application using Visual Studio 2015 Update 3 or any higher version, several compilation errors are encountered. Solution: The compilation flags required to successfully build on Visual Studio 2015 Update 3 and higher are now added to the build system. |
OSPL-10366 |
Unclear when a persistent snapshot is successfully created.
The function create_persistent_snapshot on the Domain object operates asynchronously: a thread in the durability process is responsible for making the snapshot, and the function create_persistent_snapshot returns immediately after the durability service has been instructed to make the snapshot. Therefore, it is unclear to the caller when the snapshot has safely been flushed to disk. Solution: The durability service will now log a message to the ospl-info.log when it successfully flushed the snapshot to disk. This log references a unique sequence number for each snapshot and can easily be made available for inspection by an application by configuring a ReportPlugin in your OpenSplice config file. |
OSPL-10645 |
The AutoBuiltinTopic namespace created by the durability service uses the legacy master selection algorithm, which may cause problems when the others use the priority master selection algorithm.
When the durability service configuration contains a namespace which includes the builtin topics and it is not configured as an aligner for that namespace, the durability service will create an AutoBuiltinTopic namespace for the builtin topics. However, this namespace always used the legacy master selection algorithm. When another durability service is configured as aligner for a namespace that contains the builtin topics and uses the priority master selection algorithm, a mismatch between those namespaces is detected and reported. Solution: When the durability service creates the AutoBuiltinTopic namespace it now uses the configured master selection algorithm. |
OSPL-10644 |
CM java API Entity toString could return null
The CM Java API could return null when the Entity.toString function was called for an entity without a name. When trying to print an Entity this function is called, but not all JVM implementations are able to handle a null string, which could lead to a NullPointerException. Solution: When toString() is called on an Entity with no name, the simple name is returned. |
OSPL-10641 |
Incorrect log messages when bringing down network interface
Incorrect log messages occurred when using the RT networking service with multiple virtual adapters configured and bringing one of the virtual devices down. The ospl-info log reported that the main adapter went down, which was not the case. Solution: The incorrect adapter name has been fixed and the log now only reports information when changes occur on the adapter actually used by the RT networking service. |
OSPL-10596 |
In case a durability service receives many unrequested chains then its log file could get polluted
When a durability service combines sample requests it will send the combined answer to all durability services. The answer is only intended for the durability services that requested the data; all others will drop the data. When a durability service receives such an unrequested answer, a log message is written for every message that is received. When there are many unrequested answers this leads to many messages and hence pollutes the durability log file. Solution: The line that was responsible for log pollution has been removed. |
OSPL-10590 |
ospl tool might miss spliced termination
The ospl tool has a 2 second period during which it initially waits for the splice daemon. When the splice daemon terminated within those 2 seconds because of a normal termination request, the ospl tool missed the termination and kept waiting for the maximum time of 60 seconds before exiting. This could only happen when ospl start and ospl stop were called from different threads. Solution: The ospl tool is now aware of normal termination within 2 seconds. |
OSPL-10585 |
Writing samples that contain a sequence of a primitive type in C# does not work
When using C# and having a structure in IDL that contains a member whose type is a sequence of a primitive type, writing samples of that structure did not work. Solution: The problem is fixed and a writer in C# can now write the sequences. |
OSPL-10573 |
Unable to create a shared memory segment larger than 1.9 GB
SetFilePointer is used to set the file pointer. Only the low-order 32-bit value was used, limiting the range to 2 GB. Solution: SetFilePointer also accepts an optional second 32-bit value which can be used to access file locations beyond the 2 GB barrier. The code has been changed to use this second parameter. |
OSPL-10571 |
OpenSplice refuses to start when using a decimal human readable size value in the configuration
When a configuration contains a decimal human readable size value, e.g. a database size set to 0.2G, OpenSplice refuses to start. Solution: The fault in the configuration checker was fixed and decimal human readable sizes are now accepted. |
OSPL-10563 |
When the master selection algorithm is unable to select a master in time it chooses a master based on majority voting without generating a warning. Also, the threshold to resort to majority voting was extremely high.
When there are multiple durability services in a system these durability services have to choose a master for each namespace. The master is the one that is responsible for aligning the other durability services. In case of the legacy master algorithm several iterations are carried out if there is disagreement about who should be the master. If after a fixed number of rounds there is still no agreement, then majority voting is used to force a master. Because this situation may lead to improper alignment, the use of majority voting justifies a warning in the log files. However, this situation was not logged. Solution: A warning is now printed in the ospl-info.log when majority voting is used to choose a master, and the threshold that decides when majority voting will be used has been reduced. |
OSPL-10534 |
ddsi2(e) may crash due to potential corruption of internal hash table.
In certain scenarios the ddsi2(e) service may crash due to corruption of an internal hash table. This is caused by the hash table not resetting slot pointers when an entry needs to be moved to another slot. In most cases the bogus pointer is quickly overwritten by another entry and no harm is done. However, if this is not the case and the original entry is removed, a dangling pointer remains in the hash table which may crash ddsi2(e). Solution: The slot pointer is now reset to NULL after an entry is moved to another slot. |
OSPL-10522 |
Tuner import of transient data isn't received by late joiners
When the tuner imports transient data it is not received by late joining readers because the imported data is published by a writer with auto_dispose_unregister=true and is thus purged from the system. Solution: Tuner publishes imported data with a writer with auto_dispose_unregister=false |
OSPL-10465 |
ospl start terminates splice daemon before it has the chance to become operational
The ospl tool had a hardcoded 10 second timeout within which the splice daemon had to become operational; if not, it would terminate the splice daemon and report an error. On some occasions this timeout proved too short. Solution: Increased the timeout to 60 seconds. This is a worst-case value; in most situations it won't affect behavior. |
OSPL-10462 |
NullPointerException during listener termination.
When using the Java API with listeners, a NullPointerException can occur when deleting an entity which has a listener attached. Solution: The cause of the fault is still unclear, but a workaround has been implemented to catch the exception so the listener will terminate correctly. When this happens a trace is also added to the ospl-error.log. |
OSPL-10423 |
Sending out an array of a typedef of a string in C may cause an application crash
When you model an array of a typedef of a string in your IDL and generate C code from this model, then the resulting code may corrupt/crash your application when you try to write samples containing such an array due to incorrect dereferencing of its pointer. Solution: Correct operator precedence has now been enforced by explicitly using brackets in the generated code. |
OSPL-10425 |
Possible error return by ospl tool while splice daemon is started
When the ospl tool started the splice daemon and the start took longer than the maximum allowed start time, the ospl tool returned an error while the splice daemon was still running. Solution: When the maximum allowed start time has passed, the ospl tool now kills the splice daemon and reports an error. |
OSPL-10423 |
For sample attributes of type array only the first 4 or 8 bytes are actually marshaled
out in ISOCPP.
In ISOCPP (not ISOCPP2), sample attributes of type array are only marshaled out using the first 4 bytes (for 32-bit platforms) or the first 8 bytes (for 64-bit platforms), the rest of the array may be filled with garbage. This might cause data corruption or even a crash of the marshaling algorithms on either the sending or the receiving side. This was caused by an incorrect way to establish the size of the array, which returned the size of the pointer to the array instead of the size of its content. Solution: The algorithm now correctly establishes the size of an array. |
OSPL-10420 |
Wait for historical data timeout when using multiple readers and transient local with DDSI2(E)
When using the DDSI2(e) network service with transient-local durability set on a topic, the following could happen: with 2 nodes, where one node is the writer and the other node creates 2 or more readers, the wait-for-historical-data call on the second created reader returned a timeout when there was no data in the system for that topic. Solution: The problem is fixed; in this scenario the created readers no longer return a timeout when there is no data for them. |
OSPL-10413 |
When using delayed alignment in the durability service it may occur that a master conflict is not resolved.
When the durability service configuration contains a namespace which is configured to use delayed alignment it may occur that the master selection for that namespace does not converge. The cause of this problem is that when delayed alignment is used the master selection does not take the system id and the quality of the namespace into account. This causes each durability instance that detects a master conflict to first select itself as the new master, which in turn may cause a master conflict at another durability instance. This process repeats itself, with the result that no final master is selected. Solution: In case of delayed alignment the master selection algorithm now takes both the quality of the namespace and the system id into account when selecting a master. |
OSPL-10382 |
Missing metaconfig error message logged when running iShapes demo
The metaconfig.xml file is missing from the iShapes demo installer. Solution: Added the metaconfig.xml file to the iShapes demo installer and added a priority search location for it in the current directory |
OSPL-10376 |
Wrong display of octet in Tuner
When writing an octet field with the Tuner it was not possible to write values > 127. Solution: The problem is fixed and the Tuner can now write values for octets from 0 to 255. |
OSPL-10362 |
NullPointerException in the Tuner.
When using the Tuner to read a sample which contains a union inside a nested structure, a NullPointerException could occur. Solution: The Tuner cannot handle some cases where a union is inside a nested structure; as a workaround the NullPointerException is now caught so the Tuner won't crash and the sample can be read. Only the value of the union won't be displayed correctly in such a case. |
OSPL-10361 |
Google Protobuffer meta data injecting not efficient on isocpp2
Google Protobuffer meta data was generated into a hex string and translated at runtime into a byte array using the sscanf function; for big types this meant many sscanf calls. Solution: The Google Protobuffer isocpp2 meta data is now generated as a byte array so no runtime translation is required. |
OSPL-10360 |
On VxWorks a condition variable may remain blocked after being signaled.
On VxWorks, for each thread that uses a shared condition variable a named semaphore is allocated that is used to wake up this thread when it waits on a shared condition variable. When a thread starts waiting on a shared condition variable it registers this thread-specific semaphore with the condition variable and waits on this semaphore. When another thread signals the condition variable it posts the semaphore that is registered with the condition variable to wake up the corresponding thread. The name used for this thread-specific semaphore should be unique. However, threads created within the same RTP could generate the same name for this semaphore, which could cause the wrong semaphore to be released when triggering a condition variable. Solution: The name generated for the named semaphore used for each thread is now determined atomically within an RTP. |
OSPL-10359 |
Fault in QoSProvider schema.
The provided DDS_QoSProfile.xsd contains an element dds_ccm; this element is not suitable for OpenSplice and needs to become dds. The dds_ccm element was introduced by an inconsistency in the OMG spec itself. Solution: The element changed from dds_ccm to dds, fixing the inconsistency. |
OSPL-10358 |
Possible stack overrun when returning the loan after reading a large amount of instances
When having a large number of instances and doing a single read, it was possible to get a stack overrun when returning the loan after the read. Solution: The internal freeing of the loans was changed to behave iteratively instead of recursively. |
OSPL-10188 |
Memory leak in flex/bison generated parser
Memory leak in the flex/bison generated parser, introduced by the changes to make it thread safe. Solution: Removed the memory leak by making the parser reentrant (requires bison version >= 2.7). |
OSPL-10174 |
Master selection algorithm takes too long to decide on master
The legacy master selection algorithm (which is selected by default and which should be used when communicating with nodes that run < V6.8.1) may sometimes take too long to decide who becomes the master. When a node needs to confirm its vote, it resets its master selection in the mean time, resulting in a non-resolved conflict and another master selection cycle until majority voting finally kicks in. All these extra voting cycles may cause increased alignment times. Solution: The master selection is no longer reset between the establishment of the master and its confirmation. |
OSPL-10169 |
The RT networking service does not use active garbage collection of buffers used by a
best-effort channel.
The networking packets (fragments) received by an RT networking channel are stored in defragmentation buffers. When messages are fragmented over several packets and a packet is lost, the best-effort channel maintains packets it has already received because packets may arrive out of order. However, the networking service does not apply active garbage collection of these buffers; it only applies garbage collection when the number of free buffers becomes low. Because no active garbage collection is applied to the buffers of a best-effort channel, the number of used buffers may increase considerably when a large amount of packet loss occurs on the network. Solution: For a best-effort channel, garbage collection of defragmentation buffers containing incomplete messages is applied at regular intervals. The garbage collector frees defragmentation buffers that contain incomplete messages when a threshold is exceeded. This still allows packets that arrive out of order to be delivered to the application. |
OSPL-10168 |
Ownership transfer may fail when using a deadline.
Ownership transfer occurs when an instance is unregistered, or when it misses its deadline. When a disconnect occurs to an instance that also has a deadline applied to it, and the deadline expires before the loss of liveliness is detected, then the deadline mechanism is the first to transfer the ownership away from the disconnected owner. However, when later the loss of liveliness is also being processed while no new ownership has been claimed in the mean time, then the unregister message that communicates the loss of liveliness will reclaim ownership for the now disconnected owner. Lower strength writers that try to transmit messages afterward will never be able to claim ownership over the instance anymore, and thus will not be able to deliver their data. Solution: Unregister messages will no longer be able to claim ownership for instances that currently have no owner. |
OSPL-10157 |
Coherent historical EOT purging inefficient (high load)
In a system with a large number of transient/persistent topic/group coherent writers the EndOfTransactions (EOTs) purging could take a long time, as it was implemented as a loop within a loop with reference checking, which was often called both synchronously and asynchronously. Solution: Simplified the purging by removing the loops and making it synchronous only. |
OSPL-10143 |
Creating a group coherent writer in the default partition always created an error log message
When creating a transient/persistent group coherent writer in the default partition an error message was incorrectly logged. Solution: Updated the group coherent partition nameSpace matching so that it correctly handles the default partition. |
OSPL-10139 |
The isocpp2 read/take operation using a selector in combination with an iterator returns
the incorrect number of samples
The isocpp2 read/take operation using a selector in combination with an iterator ignores the provided max_samples parameter, which causes the operation to return an incorrect number of samples. Instead of using the provided max_samples parameter it uses the max_samples attribute set on the selector. Solution: The provided max_samples parameter is now used to limit the number of samples returned. |
OSPL-10113 |
Merging with an empty set could lead to a state update even when a data set was not changed. This triggers unnecessary alignment and may cause resurrection of data.
In case a durability service has a MERGE policy configured and it merges with an empty set, the durability service will still publish a state update of its set even though its state has not changed. The state update in turn may cause other nodes to initiate alignment and hence cause resurrection of data that was already aligned and taken before. Evidently, applying the MERGE policy with an empty set does not change the set, so no state update should be generated. Solution: When a MERGE policy is applied with an empty set, no state update is generated any more. |
OSPL-10101 |
Possible hang in delete_domainparticipant when splice daemon would not join in singleprocess mode
If the splice daemon thread (single process mode only) did not exit, the join on it would block indefinitely. Solution: The join on the splice daemon is now conditional; when the thread is unjoinable an error is returned. |
OSPL-9706 |
Unattended installer ignores providedLicenseFile argument
When doing an unattended install of OpenSplice and providing a valid providedLicenseFile argument, the installer ignored the argument. Solution: The problem is fixed and the installer now uses the given providedLicenseFile argument to set the license file. |
OSPL-7563 |
The Custom_Lib solution files for MS Windows do not contain a Debug configuration.
To build the various custom_lib's on MS Windows solution files are included. These solution files contain only a Release configuration. To support the debugging of customer applications a Debug configuration should be added to the provided solution files. This allows the customer to build a debug version of the custom libraries. Solution: A Debug configuration is added to the custom_lib solution files. The library name of the debug version of the custom lib is extended with the letter 'd'. For example the debug version of the standalone C++ API will be called dcpssacppd.dll instead of dcpssacpp.dll. The same applies to the other custom libraries. |
OSPL-10343 / 17878 |
The durability service of a node that is configured as a non-aligner
for a namespace may not become complete even if there is an aligner available.
When the durability service of a node that is configured as a non-aligner for a namespace is initializing and it encounters an aligner that is already complete, the non-aligner may have missed the message in which the aligner indicates that its groups are complete. Even though the non-aligner notices that it needs to request the state of the groups from the aligner, this is blocked by the non-aligner waiting for all its groups to become complete. This may effectively lead to a situation where the durability service does not make any progress. Solution: When the non-aligner is not complete, new events are no longer blocked. This causes the non-aligner to request the groups when the aligner appears, notice that they are all complete, and proceed again. |
Report ID. | Description |
---|---|
OSPL-8444 |
Unable to set the "Don't fragment" bit on outgoing UDP packets There was no configuration option to set this bit. Solution: Added a configuration option "DontFragment", located at (S)NetworkingService/Channels/Channel/Sending and NetworkingService/Discovery/Sending |
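A hedged sketch of how the new option might appear in an XML configuration file; the surrounding element nesting follows the path named above, but the service name and the boolean value shown are assumptions — consult the Deployment Guide for the exact schema:

```xml
<OpenSplice>
  <NetworkingService name="networking">
    <Channels>
      <Channel name="BestEffort">
        <Sending>
          <!-- Assumption: boolean value; sets the IP "Don't fragment" bit
               on outgoing UDP packets for this channel. -->
          <DontFragment>true</DontFragment>
        </Sending>
      </Channel>
    </Channels>
  </NetworkingService>
</OpenSplice>
```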
OSPL-9882 |
Linux: MATLAB/Simulink hangs when connecting to shared memory domain On Linux, a MATLAB script or Simulink model connecting to a Vortex OpenSplice domain via shared memory would hang. Solution: MATLAB, like Java applications, requires the environment variable LD_PRELOAD to be set to reference the active Java installation's libjsig.so library. The MATLAB user interface uses Java, and thus requires the same signal-handling strategy as Java applications connecting to Vortex OpenSplice. The precise syntax for setting the LD_PRELOAD environment variable depends on the shell being used. For Oracle JVMs, LD_PRELOAD should contain this value: $JAVA_HOME/jre/lib/amd64/libjsig.so |
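For example, in a Bourne-style shell the variable could be set before launching MATLAB; the JVM installation path below is an assumption and should be adjusted to the active Java installation on the system:

```shell
# Assumption: path of an Oracle JVM installation; substitute your own.
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
# Preload the JVM's signal-chaining library so MATLAB inherits it.
export LD_PRELOAD="$JAVA_HOME/jre/lib/amd64/libjsig.so"
# matlab   # launch MATLAB from this shell so it inherits LD_PRELOAD
echo "$LD_PRELOAD"
```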
OSPL-9893 |
Simulink unbounded string default dimension A Simulink user needs to be able to specify the maximum dimension of the character arrays created by IDLPP for the DDS 'string' type. Previously IDLPP generated a maximum dimension of 256, which was not user-overridable. Solution: The IDLPP tool has been updated with additional functionality for generating Simulink bus definitions. For unbounded strings (and sequences in general), the IDLPP tool generates a .properties file that can be edited with Simulink-specific bounds for these string and sequence types. See the DDSSimulinkUserGuide for more information on the IDL properties file. |
OSPL-9895 |
IDL types must have unique names before importing into Simulink to create
the Simulink Bus. All IDL types previously had to have unique names so that importing IDL into Simulink would not overwrite bus types, since Simulink bus types must have unique names. Solution: The IDLPP tool has been updated with additional functionality for generating Simulink bus definitions. When the tool detects a potential clash in struct names, it generates a .properties file that can be edited with Simulink-specific struct name overrides for the bus definition. See the DDSSimulinkUserGuide for more information on the IDL properties file. |
OSPL-9932 |
Warnings reported in MATLAB when invoking idlImportSl after a fresh MATLAB restart Running idlpp for the first time after starting MATLAB caused the following warning; running the same command again, the warning was gone. >> Vortex.idlImportSl('zz.idl', 'data.sldd') Warning: Cannot use definition of enumerated type 'Color' in dictionary '/home/prismtech/git/osplo/src/tools/matlab/examples/idl/data.sldd' because it is defined externally. You must remove the externally defined MCOS class before you can use the dictionary definition. > In Vortex.idlImportSl (line 14) Warning: Cannot use definition of enumerated type 'Enum_Day' in dictionary '/home/prismtech/git/osplo/src/tools/matlab/examples/idl/data.sldd' because it is defined externally. You must remove the externally defined MCOS class before you can use the dictionary definition. Solution: The Vortex.idlImportSl script has been fixed to avoid producing this warning. |
OSPL-9994 |
Setting a QoS file in Simulink block parameter uses an absolute path
to specify the file URI An absolute path was used to specify the file URI, which meant the QoS profile had to be set for each block without the default QoS. Solution: The Simulink DDS blocks have been updated such that the QoS file URI is now set during each block's initFcn instead of at selection time of the QoS file name in the mask dialog. The URI is also always relative to MATLAB's current working directory, which is re-resolved during diagram update/simulation start. |
OSPL-10100 |
Spliced can crash during termination when thread not alive When the splice daemon was terminating, it could crash when one of its threads was deadlocked. A deadlocked thread was not joined and its resources were freed; when the deadlocked thread became alive again it could crash because it used the freed memory. Solution: A timed thread join is now always performed; when the join fails, cleanup is stopped and spliced exits. |
OSPL-10137 / 17800 |
Error when using multiple participants in single process deployment. When using single-process deployment and creating domainparticipants for the same domain in multiple threads spawned from the main thread, an "Exception : Error: Failed to create DomainParticipant" appeared. Solution: It is now possible to create multiple participants at the same time for the same domain in single-process mode. |
OSPL-10185 |
The classic C++ API leaks memory that is allocated for the default QoS values The default QoS values in the C++ API are stored in static variables. When the application exited, the memory allocated for these default QoS values was not freed. The same occurred for the DomainParticipantFactory. Solution: The identified leakage has been resolved. |
OSPL-10315 |
When a service exits some memory is leaked When a service is started it registers a cleanup handler. When the service exited, the associated memory was not freed, causing a small memory leak. Solution: The identified leakage has been resolved. |
OSPL-10324 |
Simulink - remove "status" ports from Domain, Topic, Publisher and Subscriber blocks The status port for the participant, topic, subscriber, and publisher blocks does not offer any value: the status is only used to detect errors, and when they occur the simulation stops. Solution: The status port value for these blocks has been made inaccessible to the simulation. |
OSPL-10367 / 17969 |
JVM crash when using multiple definitions of the same Enum value. When using the Java API with different IDL files that contain multiple definitions of the same Enum value in the same namespace, the JVM crashed where it should have produced an error report. Solution: The problem is fixed; in this scenario a valid error report is now logged and the JVM no longer crashes. |
TSTTOOL-469 / 17530 |
Max Samples Kept Per Reader preference not always respected In certain use cases, the max samples kept per reader preference did not work: the sample list tab table could contain more than the expected number of samples per reader. This could be seen when a reader had a high number of samples coming in at a fast rate. Solution: The back-end data model was not removing samples from the display list correctly; this has been fixed. |
Report ID. | Description |
---|---|
OSPL-9497 |
When the durability namespace equality check is enabled a merge
action may fail when combining alignment requests When the equality check is enabled for a namespace, the durability service checks whether the aligner data set is the same as the data set from the alignee when applying a merge policy. However, when the aligner tries to combine requests (controlled by the RequestCombinePeriod), it incorrectly compared only the data set from the first request. This could cause it to conclude that the data sets are the same and no merge has to be performed, even though another request may have a data set that is not equal. Solution: When the durability service tries to combine the data requests associated with a namespace, it checks whether all the associated data sets are the same. It may combine requests whose sets are the same as its own (those that do not require alignment of the data), and it may combine requests whose sets are not equal to its own and therefore do require alignment. |
OSPL-9553 / 17187 |
When a wildcard partition is used for a datawriter data samples
may be sent twice. When a datawriter is created using a publisher that has a partition QosPolicy set to a wildcard partition, then depending on timing the datawriter may be attached to the same partition (group) twice when matching the partition expression with the existing partitions, causing each sample to be written twice. When the datawriter is created it tries to match the partition expression with the existing partitions; however, the spliced daemon also tries to match existing datawriters with newly created partitions, which may attach the datawriter to the same partition (group) twice. Solution: When matching the datawriter with a partition, it is now checked whether the datawriter is already connected to that partition. |
OSPL-10183 |
Some memory leaks in the durability and networking service. When terminating, the durability service could leak the memory associated with the configured policies and merge state. When terminating, the networking (ddsi) service could leak the memory used to associate the networking service with a topic and partition combination. Solution: The identified leakage has been resolved. |
OSPL-10200 / 17818 |
Unable to start a service using a script. With the introduction of the configuration validator, it became impossible to use a script to start a service. Solution: When the service executable name is not one of the defaults (durability, networking, etc.) but a shell script or other executable, it is checked whether the specified file exists and is executable (proper flags set). In that case it is also accepted as valid input. |
OSPL-10201 / 17819 |
Out of resources when creating and destroying a domain participant. In a very rare situation, when closing the domain just after opening it, a thread that had exited was not joined due to the fast shutdown after startup. This led to memory not being freed. After about 32768 such cases, the system ran out of resources and no new threads could be created. Solution: Always wait for the thread in case we expect it to have run. If it is running, we wait for it to exit; if it has already exited, the wait returns immediately but still frees the resources held by the thread. |
OSPL-10272 |
Not all ddsi2 threads are able to track CPU progress. Ddsi2 has the ability to track progress of its threads in terms of the number of CPU cycles spent by these threads. This helps in identifying the root cause of a "Failure to make Progress" notification: if no cycles were spent for a long time a thread might be in a deadlock; otherwise it might be losing out when competing with higher-priority processes/threads for the CPU. However, this ability could previously only be applied to a limited subset of the ddsi2 threads. Solution: All relevant ddsi2 threads now have the ability to track CPU progress. |
OSPL-10281 |
ISOCPP2 API is missing functionality to locate existing
participant by its domainId. The ISOCPP2 API offers a function dds::domain::find() that allows you to pass a domainId and that returns an existing participant attached to the Domain identified by that Id (and null when no match can be found). However, in OpenSplice the implementation for this function was still missing. Solution: OpenSplice now implements the dds::domain::find() function correctly. |
OSPL-10282 |
DurabilityClient reports error on NO_DATA DurabilityClient reported an error message when taking data from a built-in topic returned NO_DATA. NO_DATA is not an error and is expected behaviour in this situation. Solution: When a take of a builtin topic for DurabilityClient returns NO_DATA, it is no longer treated as an error. |
OSPL-10289 |
Crash when reading sequence of sequence of primitives in ISOCPP2 When using ISOCPP2 and an IDL with a sequence of a sequence of a primitive, or of a struct with only primitive fields, a Segmentation Fault could occur when such a sample was read. Solution: The problem is fixed; the application no longer crashes and the sample can now be accessed properly. |
OSPL-10318 |
Durability may select wrong namespace master when using the same
master priority and no persistent store configured. When the durability service is configured with a namespace with a master priority set, the same or highest master priority is used by each durability instance, and these durability services have no persistent store configured, then the master selection may fail. Besides the master priority, the quality of the data set is taken into account when determining which durability instance becomes the master for a particular namespace. When no persistent store is configured, the initial quality of the namespace was not initialized correctly, which caused each durability instance to select itself as master. Solution: When determining the master for a namespace that has a master priority set, the initial quality of the namespace is now initialized to 0 when no persistent store is configured. |
Report ID. | Description |
---|---|
OSPL-7783 OSPL-9602 OSPL-9837 / 17627 |
idlpp hangs if the IDL file being processed ends with a #pragma line If the last line in an IDL file is a #pragma, idlpp will consume 100% of CPU cycles and eventually fail. Solution: Add a newline after the #pragma and save the file. |
OSPL-9814 |
The ddsi service incorrectly uses the reception time to
unregister the instances of a terminating remote writer. When a remote writer terminates, the corresponding instances written by that writer are unregistered. The ddsi service unregisters these instances when it is notified that the remote writer has terminated; however, it used the reception time of the notification as the source time for the unregistration of the corresponding instances. This may cause the unregistration to be considered outdated when the nodes are not time-aligned. Solution: The source timestamp present in the protocol message that indicates the termination of a writer is now used as the source timestamp for the unregistration of the corresponding instances. |
OSPL-9957 |
The creation of a topic, datareader or datawriter using an
inconsistent resource-limits QoS policy succeeds. The max_samples and max_samples_per_instance settings of the resource-limits QoS policy should be consistent, which means that max_samples should be greater than or equal to max_samples_per_instance. However, the creation of an entity, or the setting of the QoS of an entity, using a resource-limits policy with inconsistent max_samples and max_samples_per_instance succeeded. Solution: The consistency of the resource-limits QoS policy is now checked when creating an entity or setting the QoS of an entity, returning no entity or BAD_PARAMETER respectively. |
OSPL-10011 |
Durability service with XML persistent store crashes during
termination. When a large amount of data is written to the XML persistent store and the disk drive containing the files is slow, operations like fsync, required to guarantee consistency of the files after e.g. a power failure, can take a considerable amount of time. The durability service would incorrectly determine that a thread had not made any progress when it was blocked on disk I/O. When the service was terminated at the same time, it could potentially crash while releasing thread resources. Solution: The issue is resolved by temporarily disabling thread liveliness monitoring during disk I/O. The same mechanism was already in place for the KV persistent store. |
OSPL-10078 |
The ddsi garbage collector may block on a full writer history
cache (WHC). The ddsi garbage collector is used to clean up the administration related to local and remote entities that are being deleted. Under certain circumstances the garbage collector could block on a writer that still has data available in its history cache (WHC), e.g. unacknowledged data. Solution: Allow the garbage collector to continue when it encounters a writer with a full WHC. |
OSPL-10115 |
DDSI time+duration implementation assumes signed overflow wraps
around. DDSI's code for adding a duration to a time relied on signed overflow wrapping around, but signed overflow is actually undefined behaviour in C. Many combinations of platform and compiler do the "expected" thing by wrapping, and some provide switches to guarantee this, but the code really should not rely on it. Solution: The algorithm has been modified and no longer relies on signed overflow wrapping around. |
OSPL-10116 |
A deadlock may occur in ddsi when the LogStackTraces option is
enabled The ddsi service monitors the progress of its threads. When it notices that a thread fails to make progress, the ddsi service can log the stacktrace of the corresponding thread when the LogStackTraces option is enabled (on selected platforms only). To retrieve this stacktrace a special signal and corresponding signal handler are used. However, this signal handler performed a memory allocation, which may cause a deadlock. Solution: Pre-allocated memory is now used by the signal handler to store the stacktrace. |
OSPL-10133 |
DurabilityClient accesses freed memory When enabled, the DurabilityClient functionality could attempt to access already freed memory, causing random crashes. Solution: The accessed memory has been protected to prevent it from being freed while still being accessed. |
OSPL-10135 / 17788 |
Merge policy "Catchup" may crash the durability service in case
of Exclusive ownership. When a durability service has to apply a "Catchup" policy, and some of the instances it already contains are not covered by the set that is currently being aligned, it must conclude that these instances must have been disposed on the durability master, since otherwise they would have been covered by the set that is currently being aligned. In that case it will create its own implied dispose messages for those missing instances, which may be left partially uninitialized because it is no longer able to establish the exact writerGID, timestamp and inline-qos of the original dispose message that was used to dispose the instance. However, in case of exclusive ownership the missing inline-qos may cause a segmentation violation when the durability service tries to extract the writer strength from it. Solution: The durability service will no longer try to extract the writer strength from the inline qos of an implied dispose message. |
OSPL-10155 |
Use of non-keepall coherent writers may lead to sample loss in
combination with resends When a coherent writer has a history QoS policy set to KEEP_LAST, it may occur that due to resends a sample in the writer history is overwritten by a new sample. When this occurs, the corresponding transaction will not become complete because of the gap in the sequence numbers assigned to the subsequent samples. To prevent this, the writer should be created with a KEEP_ALL history QoS policy setting. Solution: The creation of a group or topic coherent writer is only allowed when the corresponding history QoS policy is set to KEEP_ALL; otherwise the creation will fail or the enabling of the writer will return PRECONDITION_NOT_MET. |
OSPL-10163 OSPL-10167 / 17806 |
Segmentation Fault when using libconfig in combination with
OpenSplice and DDSI. In some cases a Segmentation Fault could occur when using DDSI and an OpenSplice application in Single Process mode that is linked with libconfig (-lconfig). Solution: The problem was that libconfig and DDSI both have similarly named functions with different signatures. The DDSI functions have been renamed to avoid this issue. |
OSPL-10184 |
Possible Durability service crash during termination when
Networking service failed to start. A few Durability service threads entered an infinite write-attempt loop (that only quits when Durability terminates) if the Networking service did not start. Within this loop the threads were not signalled to be alive, which forces an alternative path when Durability terminates and can cause a crash. Solution: Signal the threads to be alive during the write-attempt loop. |
OSPL-10236 |
The durability service over-aggressively requests groups from
remote nodes The durability service needs to exchange group information in order to request data for these groups from remote nodes. To exchange group information, group request messages are initiated and the remote node responds with group messages. The durability service sent such group requests far too often, even when it already knew about the groups of the remote node. Especially in situations with many disconnects and many groups, the superfluous group exchanges may increase network load and decrease performance. Solution: The number of superfluous group requests is significantly reduced when the groups of the remote node are already known. |
OSPL-10238 / 17837 |
Potential crash in durability service during peer discovery There was a race condition in the peer discovery of the durability service, where a partially discovered (and thus still partially uninitialized) peer could already be pulled through a matching algorithm and trigger a segmentation violation. Solution: The matching algorithm has been modified to avoid processing of uninitialized peers. |
OSPL-10250 |
When there are no writers for a sample, the writer registration
for data that is being aligned must be removed. When there are no writers, the writer registration must be removed so that the correct liveliness state is set for the data being injected by the durability service. To remove the writer registration, an unregister message is generated. However, the unregister message did not fill the key fields correctly, causing the unregister message to be flawed. Solution: The key fields of the generated unregister message are now set correctly. |
OSPL-10252 |
Potential crash during startup or permanent incompleteness of
durability service. A race condition during initialization between the conflict resolver thread and listener threads in the Durability service could cause a crash in the Durability service, or missing events that lead to a permanently incomplete state of kernel groups. Solution: Starting the conflict resolver thread is delayed until the listeners are initialized. |
Report ID. | Description |
---|---|
OSPL-7942 / 38552 OSPL-9825 / 17621 |
Thread specific memory leaking away after thread end Threads that are created by a user application do not use our thread wrapper, so thread-specific memory allocated by our software stack was not freed when these threads exited. Solution: Use OS-supplied callback functionality to call a destructor function when a user-created thread exits, to free the allocated memory. |
OSPL-9631 / 17524 |
On VxWorks a deadlock may occur when creating a datareader. On VxWorks, the implementation of condition variables located in shared memory makes use of named binary semaphores. For each thread one named semaphore is allocated, used to notify the thread when it is waiting on a condition variable. However, when a thread started waiting on a shared condition variable it could register the wrong name (id) with the condition variable, which meant the thread was not woken when this condition variable was signalled. Solution: For each thread a named binary semaphore with a unique name is allocated. The correct name (id) is registered with a condition variable when the thread starts waiting on that condition variable. This allows another thread or RTP to find the correct binary semaphore when signalling the condition variable and wake the thread that is waiting on it. |
OSPL-9743 / 17579 OSPL-9775 / 17600 OSPL-9776 / 17599 OSPL-9793 / 17607 OSPL-9847 / 17628 OSPL-9854 / 17633 |
Durability service crash when deleting a reader. The Durability service obtains historical data from a remote durability service and delivers it to all local datareaders. When it has completed delivery of all historical data, it signals this to all local datareaders. When a datareader related to the topic for which alignment has completed is deleted at that specific time, it is possible that the Durability service crashes due to a dangling pointer. Solution: When notifying completion of the historical data alignment, individual readers are claimed first to ensure they cannot be deleted during the notification process. |
OSPL-9771 |
Group coherent transaction purging no longer depends on the
GroupCoherentCleanupDelay configuration option. The GroupCoherentCleanupDelay configuration option should be removed, as all information needed to deduce the delay is available internally without the need to configure it externally. Solution: Group coherent transaction purging is the removal of transactions that cannot become complete anymore. A transaction cannot become complete anymore when a following transaction has been completely received. Only when durability is in the process of alignment is it possible that an older transaction becomes complete, so no purging is done while aligning. This replaces the GroupCoherentCleanupDelay configuration option, which always postponed purging by the configured time. |
OSPL-9798 |
Improper use of a property in durability namespace administration One of the properties of a durability namespace was not properly taken into account when the namespace was received from a fellow durability service. This potentially caused undefined behaviour, including the possibility that a merge conflict is not properly resolved for that namespace. Solution: The namespace admin is now fully copied when a namespace is received by the namespaces listener thread of the durability service. |
OSPL-9807 / 17616 OSPL-9809 / 17614 OSPL-9821 |
For namespaces with a non-legacy masterPriority a delay is used in
the algorithm to select a master. This delay may prevent alignment from taking place. When durability services discover each other they negotiate which one (the master) will take up the responsibility to align other durability services. Before a master is elected there is a period in which no master exists. When previously generated conflicts are resolved in this period, they might get dropped because no master is present yet. This may result in alignment not taking place once the master has been elected. Note that the period of not having a master is unnecessary when it is clear who will become the master. Solution: The delay to choose a master has been reduced to 0 for namespaces with a masterPriority, so that there is no risk that conflicts are dropped because the master is not yet selected. |
OSPL-9811/17613 OSPL-9944 |
Durability service and application deadlock or durability may
crash when concurrently deleting a datareader and marking alignment
complete Once the durability service finishes alignment of historical data, it notifies the datareader that the process has completed. If that datareader is deleted by the application at the same time, both the durability service and the application deleting the datareader may run into a deadlock, because deletion of the datareader and marking completeness of historical-data alignment take two locks in opposite order. Solution: The algorithm to mark historical-data alignment complete has been refactored to avoid pushing completeness to the readers altogether, by keeping track of incomplete groups in a separate structure and letting wait_for_historical_data rely on that instead. |
OSPL-9826 |
Out of order delivery of invalid sample might throw away later
history When an invalid sample is delivered out of order to a Reader that has history depth > 1, and newer samples for the same instance are already present, the invalid sample may legally be dropped. However, it would previously also drop all samples for the same instance that are newer than the invalid sample. Solution: Only the invalid sample itself is now dropped in this scenario. |
OSPL-9840 |
Possible uninitialized memory reads when using Replace merge
policy. When using a replace merge policy, in some cases a sample could be evaluated right after it had been freed, resulting in uninitialized memory reads. Solution: The sample evaluation is now always performed before the sample is freed. |
OSPL-9841 / 17625 |
The creation of a Query or QueryCondition with a '(single quote)
in the expression fails. When the expression used to create a DDS query or querycondition contains an escaped ' (single quote), the creation of the query fails. Note that a single quote in a query expression should be escaped by another single quote. The cause of this error was the evaluation of the regular expression used by the query expression parser: it interpreted the first quote it encountered in the string as the end delimiter of the string, instead of matching the longest possible match. Solution: The regular expression used by the query parser has been changed to accept occurrences of escaped single quotes; when a query string contains two consecutive single quotes this is converted to one single quote. Note that the documentation still has to be updated to state that a single quote may be included in a query expression by escaping it. |
OSPL-9865 / 17634 |
Linking problem when application uses function named os_sleep
which is also defined by OpenSplice. When an application defines a function os_sleep, this function may conflict with the internal OpenSplice function os_sleep. This may cause a linking problem when linking the application code with the OpenSplice libraries. Solution: The OpenSplice os_sleep function has been renamed to os_ospl_sleep. |
OSPL-9875 |
Sample requests are not combined with an already combined sample
request To reduce network load during alignment, the aligner durability service attempts to combine sample requests from the alignees before it sends its response. The period to combine can be configured using the configuration setting //OpenSplice/DurabilityService/Network/Alignment/RequestCombinePeriod. So if two alignee durability services request the same data during this period, the aligner combines these requests. If a third request appears during the period it should also be combined, and when the period expires all three alignees should be aligned in a single go. Due to a bug the aligner did not recognise the third request as being similar to the first two, and hence the third request would not be combined. This leads to multiple responses for the same set and contributes to inefficient alignment. Solution: The algorithm to combine requests has been fixed so that an already combined request can be combined further. |
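A hedged sketch of where the RequestCombinePeriod setting named above sits in an XML configuration file; the child elements and the values shown are assumptions — consult the Deployment Guide for the exact schema:

```xml
<OpenSplice>
  <DurabilityService name="durability">
    <Network>
      <Alignment>
        <!-- Assumption: periods in seconds during which sample
             requests from alignees are combined. -->
        <RequestCombinePeriod>
          <Initial>2.5</Initial>
          <Operational>0.1</Operational>
        </RequestCombinePeriod>
      </Alignment>
    </Network>
  </DurabilityService>
</OpenSplice>
```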
OSPL-9921 |
Durability may crash when datareader is deleted shortly after
creation Once the durability service finishes alignment of historical data, it notifies the datareader that the process has completed. If that datareader is deleted by the application at the same time, durability may crash. Durability claims the datareader before accessing it, which prevents the datareader from being freed even if the application calls for deletion. Even though the datareader is not freed while it is still claimed, the datareader is 'disconnected' from the subscriber during deletion. The algorithm to disconnect the datareader from the subscriber freed part of the datareader data structure in memory and left the datareader in a state where that part of the data structure was still accessible although already freed. Once the durability service accessed this already freed data structure it crashed. Solution: The algorithm to disconnect the datareader has been modified to ensure the part of the data structure that is actually freed is no longer accessible after completing the disconnect. |
OSPL-9929 |
After removing the GroupCoherentCleanupDelay option it was
possible that an obsolete group transaction was flushed to the
kernel without accessLock The removal of the GroupCoherentCleanupDelay option, and its replacement by a rule that forbids removing obsolete group transactions during alignment, shifted responsibility for the removal of obsolete group transactions to durability. It was possible that when an obsolete group transaction contained samples that could not be discarded, those samples were flushed to the kernel without the accessLock, which could lead to late-joining readers receiving only part of the obsolete group transaction. Solution: Before flushing undiscardable samples from an obsolete transaction the kernel accessLock is now taken. |
OSPL-9930 |
Potential crash in resendManager When the configured networking service(s) have not yet attached to a new combination of a topic and partition, then any sample written to this combination is rejected (and hence retransmitted by the resendManager at a later time) until all configured networking services have successfully attached to it. When one of the rejected messages is an Unregister message, the resend manager may crash while trying to retransmit it. Solution: The algorithms used to handle message rejection and retransmissions have been modified to correctly handle Unregister messages, so the potential crash will no longer occur. |
OSPL-9958/17656 |
Durability not completing capability handshake after asymmetrical
disconnection Before durability services can start alignment they must exchange capabilities; capability exchange thus acts as a handshake that must complete before alignment can occur. To detect an asymmetrical disconnect, multiple capabilities (>2) must have been received from the same fellow durability service. The only way this could have occurred is when the fellow was disconnected. In that case a durability service should resend its capabilities to the fellow, so that the fellow can participate again in alignment activities. However, the capability was not resent in case of an asymmetrical disconnect, and the fellow would not participate in alignment anymore after an asymmetrical disconnect with the fellow was detected. Solution: The capability is resent to the fellow when an asymmetrical disconnect with the fellow has occurred. |
OSPL-9963 / 17660 |
WaitSet not triggered when a dispose is followed by an
unregister. When a ReadCondition was added to a WaitSet that should trigger when a sample is disposed, then it was possible that the waitset would not wake up when a dispose was followed by an unregister. This would only happen when the related reader contained only one instance. Solution: Re-evaluate the attached conditions when a matched invalid sample (a dispose or unregister) is replaced by another invalid sample. |
OSPL-9990 |
Multiple asymmetric disconnects in a row could prevent exchange
of capabilities and prevent alignment A precondition for alignment to start is that capabilities must have been exchanged and readers must have been discovered (handshake). If an asymmetric disconnect occurs multiple times after capabilities have been exchanged, then the capabilities of the node that got disconnected must be exchanged again once it gets reconnected. This, however, would only occur after the first asymmetric disconnect. When multiple disconnects occur, a fellow would not resend its capability because it thinks it has already done so. Furthermore, rediscovery of readers would not occur because the DCPSSubscription messages associated with the readers are not available anymore, having been taken during the previous iteration. Both the failure to resend capabilities and the failure to rediscover the fellow's readers prevent the handshake from completing after multiple asymmetric disconnects, and hence prevent realignment. Solution: The capability message is extended with an incarnation number to differentiate between the different incarnations of fellows. In that way fellows can determine to which incarnation a capability message belongs. Furthermore, the DCPSSubscriptions used to detect the readers of the fellow are now read instead of taken, so that they remain available for future incarnations. Both mechanisms are sufficient to determine whether the handshake has completed and realignment can start again. |
OSPL-1004 / 17669 |
Crash on cortexa9t.yocto after building custom_lib When building the product, NDEBUG was not defined while it was when building the custom_lib. This causes a binary mismatch when linking the libraries, which in turn can cause all kinds of problems such as crashes. A workaround is to remove "-DNDEBUG" from the custom_lib makefiles. Solution: Define NDEBUG when building the product. |
OSPL-10012 |
DDSI sometimes sends a GAP when sample still available In DDSI, the response to an ACKNACK requesting a retransmission consists of samples in DATA (and DATAFRAG) submessages, and of GAP submessages to indicate samples that are no longer available in the writer history. The GAP is encoded as a range of sequence numbers [A,B) combined with a bitmap starting at B, where bit k indicates that the sample with sequence number B+k is no longer available (k starting at 0). In the specific case where the request is for [A,B] and all samples in [A,B-n] are present while those in [B-n+1,B] are not, the GAP message simply contains the interval [B-n+1,B+1) and no bitmap. The DDSI2 service attempts to grow this interval by locating the next available sequence number, to reduce the number of round-trips required between the reader and the writer. However, because of an off-by-one error, when sample B+1 exists and the writer has published more data since, B+1 is erroneously included in the gap. This will (with a rare exception) cause the reader to move forward to at least B+2 and never request a retransmit of sample B+1. This is caused by the writer starting its search for the next available sequence number beyond the gap at B+1 instead of B. (The exception is that when packet loss occurs in the retransmit and the gap was not stored in the reorder buffer for lack of space, the reader will not move forward as much and will re-request samples, possibly but not necessarily recovering B+1.) Solution: The off-by-one error is fixed. |
OSPL-10088 |
Wrong .NET version for some Windows builds. In some Windows versions of OpenSplice the C# language binding was linked against the wrong .NET version. For example, the Windows 7 VS2010 build was linked with .NET 4.5 instead of .NET 4.0. Solution: The OpenSplice installers now comply with the following versions.
|
OSPL-10125 / 17790 |
Networking service does not halt when binding a socket failed. It is essential for the Networking service to be able to bind to its desired sockets; if it cannot, communication will fail. The Networking service continued to run when such a socket bind failed, indicating problems only by means of an error trace, leaving the application none the wiser. The Networking service should halt when it fails to bind its sockets. Solution: The Networking service now halts in an error state when a socket bind fails (with the FailureAction/systemhalt configuration the complete domain is halted in that case). |
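A hedged sketch of a domain configuration that halts the whole system when the networking service fails. The Service/FailureAction element is standard OpenSplice deployment configuration; the domain and service names are illustrative assumptions:

```xml
<OpenSplice>
  <Domain>
    <Name>ospl_domain</Name>
    <!-- If the networking service terminates in an error state
         (e.g. after a failed socket bind), halt the complete domain. -->
    <Service name="networking">
      <Command>networking</Command>
      <FailureAction>systemhalt</FailureAction>
    </Service>
  </Domain>
  <NetworkService name="networking"/>
</OpenSplice>
```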
TSTTOOL-448 / 17651 TSTTOOL-449 / 17654 TSTTOOL-452 / 17648 |
New shell command simplifies Vortex Tester Python Scripting
installation and usage In order to use Vortex Tester Python Scripting, a user had to download, install and configure a Jython engine, and then at each use, start that engine with specific command line options. Doing this was error prone. Solution: A new osplscript command is now included with Vortex OpenSplice. To install and configure Python Scripting, the user must only download the Jython installer to the current directory, and run osplscript from a console window. Once configured, osplscript will start the Python Scripting environment; no special command line options are required. Finally, osplscript accepts all standard Python/Jython command line arguments. |
TSTTOOL-456 |
Creating a group coherent subscriber in Tester fails to add
readers, and throws exceptions in log. When specifying "Create as group" from the add readers dialog, the data readers fail to initialize and the DDS poller thread throws exceptions that are logged in OSPLTEST.log. This is because from OpenSplice 6.8.0 onwards, subscribers that have the presentation policy set to access scope = GROUP and coherent access = true are created in the disabled state. Solution: Tester now prevents operations being invoked on the group coherent subscriber until it is enabled, which is done automatically after the data readers are created. An error message was also added to the add readers dialog to prevent creating new readers under an existing group coherent subscriber. |
Report ID. | Description |
---|---|
OSPL-4891 |
Java and C++ RMI incompatibility. The RMI bindings for Java and C++ use different topic names. Therefore a Java server and C++ client (or vice-versa) are not compatible with each other. Solution: A parameter (--RMILegacyTopicNames) has been added to the C++ RMI binding. By default it is enabled to ensure backwards compatibility with previous releases. When disabled, topic names will match the Java binding therefore a C++ RMI client/server will be able to communicate with a Java RMI client/server. For more information please check the RMI Getting Started Guide (section "Runtime Configuration Options"). |
OSPL-9183 |
Java 5 API read/take gives back already processed samples. In the Java 5 API, when doing a read or a take with an own allocated result list, it was possible that already processed data was returned in the result list. Solution: Only unprocessed data is now returned. |
OSPL-9274 |
Remove already deprecated MMF durability persistency option The MMF store for durability has long been deprecated and has now been removed. Solution: Support for the MMF store has been removed from the durability service after being deprecated for a long time. Switching to the new KV-persistency is transparent. |
OSPL-9383 |
Missing flags in SAC DDS_STATUS_MASK_ANY The DDS_SUBSCRIPTION_MATCHED_STATUS and DDS_PUBLICATION_MATCHED_STATUS flags were not included in DDS_STATUS_MASK_ANY. Solution: The missing flags have been added. |
OSPL-9625 |
DDSI asynchronous delivery mode behaviour change The DDSI2 service delivers data to the kernel either synchronously or asynchronously, depending on the latency budget and the transport priority QoS of the writer. If the latency budget is large enough or the transport priority is low enough (both configurable), delivery is asynchronous. Solution: The behaviour of asynchronous delivery has been changed to not drop data when the delivery queue is full, but rather to wait until there is once again room available in the queue. This behaviour more closely resembles the behaviour with synchronous delivery when a reader has reached its resource limits, but more importantly it significantly increases the stability of the throughput at very high sample rates. The default settings are such that all data is by default delivered synchronously. This change only has an impact if asynchronous delivery has been explicitly enabled and previously samples were dropped because of a full queue. |
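The synchronous/asynchronous split described above is, as the entry says, configurable. A hedged sketch of where those thresholds live; the element names (SynchronousDeliveryLatencyBound, SynchronousDeliveryPriorityThreshold) are assumed from the DDSI2 deployment configuration and the values are illustrative, not defaults:

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Internal>
      <!-- Samples whose latency budget exceeds this bound (and whose
           transport priority is below the threshold) are delivered
           asynchronously; with the default of "inf" everything is
           delivered synchronously. Values here are illustrative. -->
      <SynchronousDeliveryLatencyBound>10 ms</SynchronousDeliveryLatencyBound>
      <SynchronousDeliveryPriorityThreshold>0</SynchronousDeliveryPriorityThreshold>
    </Internal>
  </DDSI2Service>
</OpenSplice>
```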
OSPL-9626 |
DDSI now allows configuring Heartbeat timing The DDSI2 service did not allow configuring the timing of the Heartbeats sent by writers to inform the matching readers of the presence of (unacknowledged) data. While the timing parameters were fine for most networks, on long-latency networks they could result in multiple Heartbeats being sent before an acknowledgement could have been received. This is obviously bad for network utilisation. Solution: The parameters are now configurable in the Internal/HeartbeatInterval setting. The default values are identical to the previously used values, and there is no change in behaviour unless different values are configured. |
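A sketch of the new setting, assuming the min/max attribute layout of the DDSI2 HeartbeatInterval element; the values are illustrative choices for a long-latency link, not the defaults:

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Internal>
      <!-- Base heartbeat interval with back-off bounds; stretching these
           on long-latency networks avoids sending several Heartbeats
           before an acknowledgement can arrive (values illustrative). -->
      <HeartbeatInterval min="50 ms" max="8 s">500 ms</HeartbeatInterval>
    </Internal>
  </DDSI2Service>
</OpenSplice>
```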
OSPL-9627 / 17521 |
Application crash during write When inserting a writer sample in the writer history first the position where the sample should be inserted is determined. When the history is full then an older sample is removed from the history. A crash occurs when the sample that has been removed is also the position in the history where the new sample has to be inserted. Solution: When the insertion point equals the sample that is removed the insertion point should be updated to point to the next sample in the history. |
OSPL-9736 |
QoS Provider now accepts durability-service policy in DataWriter QoS. The OMG DDS4CCM Specification defines the QoS Provider API as implemented by OpenSpliceDDS (See QoS Provider documentation). An oversight in the specification resulted in the inclusion of a durability-service policy in the DataWriterQoS. OpenSpliceDDS used to reject QoS profiles that include this policy, since it is not a valid DataWriterQoS policy. However since it is included by the specification (and also in many example xml profiles), this behaviour was deemed inconvenient. Solution: The OpenSpliceDDS QoS Provider implementation was changed to no longer reject this policy, it is now ignored, reported in the info log but does not prevent creating and using a QoS Provider. |
OSPL-9737 / 17559 |
DDSI transmission blocks on high watermark without warning. When //OpenSplice/DDSI2Service/Internal/ResponsivenessTimeout is infinite (which is the default), the DDSI transmission will block on the high watermark without a warning trace when there is an unresponsive reader in the system. Solution: A warning is added that is traced after a short while when DDSI is waiting on the high watermark. |
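A sketch of the setting involved, using the element path given in the entry; the structure of the surrounding elements is assumed from the standard DDSI2 configuration:

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Internal>
      <!-- "inf" (the default) lets transmission block indefinitely on
           the high watermark when a reader is unresponsive; a finite
           value bounds the wait before the reader is declared
           non-responsive. -->
      <ResponsivenessTimeout>inf</ResponsivenessTimeout>
    </Internal>
  </DDSI2Service>
</OpenSplice>
```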
OSPL-9747 |
C99 api function dds_readcondition_create returns null. Implementation does not correctly copy the mask, causing creation of read condition to fail. Solution: Fixed the error in the mask copy routine. |
OSPL-9756 |
C# RoundTrip example reports wrong roundtrip times The C# example uses the C# DateTime.Ticks property, which is incremented every 100 ns. The example was dividing this by 1000 before using it for the calculation, losing resolution and incorrectly assuming this would result in microseconds. Solution: The full 100 ns resolution clock tick is now used in the calculation; the conversion to microseconds is done while printing the results, by dividing the result by 10. |
OSPL-9910 |
Shapes Demo cannot set Best Effort DataWriters The Shapes Demo was only able to set the Reliable policy while assuming that the default would be Best Effort. The Shapes Demo was ported from isocpp to isocpp2. The default QoS values of isocpp2 differ from isocpp and follow the standard, which means that the default is now Reliable (for a DataWriter). The result was that the Shapes Demo was not able to change the default Reliable to Best Effort. Solution: The Shapes Demo now sets the various QoS settings explicitly and no longer depends on default QoS values. |
OSPL-9911 OSPL-9956 |
Unused variable warning when building application using isocpp2 API Warning at include/dcps/C++/isocpp2/dds/domain/detail/TDomainParticipantImpl.hpp:45:57 Solution: Removed unused variable. |
OSPL-9930 |
Potential crash in resendManager When the configured networking service(s) have not yet attached to a new combination of a topic and partition, then any sample written to this combination is rejected (and hence retransmitted by the resendManager at a later time) until all configured networking services have successfully attached to it. When one of the rejected messages is an Unregister message, the resend manager may crash while trying to retransmit it. Solution: The algorithms used to handle message rejection and retransmissions have been modified to correctly handle Unregister messages, so the potential crash will no longer occur. |
OSPL-9950 |
When the c99 read/take operation returns no samples then calling
return_loan may cause a crash. When the c99 read/take operation does not return samples and the return_loan operation is called, a crash may occur when the same buffer is used again to read/take samples. The cause is that when no samples are read no buffer is allocated, yet the unallocated buffer was incorrectly added to the loan administration. Solution: When the read/take operation returns no samples the provided buffer is no longer added to the loan administration. |
OSPL-9953 / 17653 |
Including the DCPS C-API header file in a C++ program may give a
compilation warning An include file that is included by the DCPS C-API header file dds_dcps.h defines the struct os_stat and a function with the name os_stat. When included in a C++ program the name of the struct shadows the name of the function. This may cause a compilation warning. Solution: The name of the struct is redefined to be different from the name of the corresponding function. |
Report ID. | Description |
---|---|
OSPL-9708 |
Built-in topics included in all namespaces. When using RTNetworking, all namespaces were considered to contain the built-in topics. This could cause far too many conflicts as well as improper behaviour when merging the built-in topics. Solution: The built-in topics are only contained in the automatically created namespace that is intended to properly merge built-in topics. |
OSPL-9768 |
Userclock configuration name and reporting attribute types mixed up in the configurator. The Userclock configuration name and reporting attribute types are mixed up in the configurator. The name is set as Boolean and the reporting attribute is set as String. Solution: The defect is fixed and the name is now a String again and the reporting a Boolean. |
OSPL-9769 |
Alignment not taking place due to unfortunate timing. When two nodes see each other, one of them is going to be selected as the master for the other. The alignee node that does not become master waits for the master to update its state before it will acquire the data from the master. It is possible that the master raises its state before the alignee node starts waiting for the update. Effectively, this means that the alignee node will wait for an update that never comes, because it already has occurred. Solution: The node that does not become master will always request data from its master instead of waiting for a state update. |
OSPL-9785 |
CATCHUP policy may incorrectly dispose recently arriving data. When the durability service performs a catchup policy, it intends to replace its current data set with the set from its aligner. Every instance that is in its current data set but not in the aligned data set should be disposed. However, the algorithm used to insert this dispose message might incorrectly apply it to data that has arrived AFTER the request for the catchup but BEFORE the completion of this request, thus accidentally disposing data that should still be considered ALIVE. Solution: The algorithm used to insert the DISPOSE message has now been modified to apply it only to data that is older than the time of the catchup request. |
OSPL-9208 |
DDSI not sending an SPDP ping at least every SPDPInterval DDSI has a Discovery/SPDPInterval setting that is meant to set an upper bound on the SPDP ping interval, which is otherwise derived from the lease duration set in the //OpenSplice/Domain/Lease/ExpiryTime setting. The limiting only occurred when the lease duration was > 10s. Solution: The limiting has been changed to ensure the interval never becomes larger than what is configured. |
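A sketch of the setting, using the element named in the entry; the surrounding structure is assumed from the standard DDSI2 configuration and the value is illustrative:

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Discovery>
      <!-- Upper bound on the SPDP participant-discovery ping interval,
           regardless of the lease duration (value illustrative). -->
      <SPDPInterval>30 s</SPDPInterval>
    </Discovery>
  </DDSI2Service>
</OpenSplice>
```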
OSPL-8958 |
DDSI can regurgitate old T-L samples for instances that have
already been unregistered DDSI maintains a writer history cache for providing historical data for transient-local writers and for providing reliability. An instance is removed from this cache when it is unregistered by the writer, but its samples are retained until they have been acknowledged by all (reliable) readers. Already acknowledged samples that were retained because they were historical data could survive even when the instance was removed. When this happened, a late-joining reader would see some old samples reappear. Solution: deleting an instance now also removes the already acknowledged samples from the history. |
OSPL-9097 |
DDSI transmit path can lock up on packet loss to one node while another node has crashed Consider a successful retransmit to one remote reader while another remote reader that has not yet acknowledged all samples disappears (because of a loss of connectivity or a crash), all other remote readers have acknowledged all samples, and the writer has reached the maximum amount of unacknowledged data. In that situation the transmit path in DDSI could lock up: the writer could then only be unblocked by the receipt of an acknowledgement message covering a previously unacknowledged sample, which under these circumstances will never arrive because of the limit on the amount of unacknowledged data. Solution: deleting a reader now not only drops all unacknowledged data but also clears the retransmit indicator of the writer. |
OSPL-9096 |
Durability service DIED message even though the durability service is still running The d_status topic is published periodically by the durability service to inform its fellows of its status. By using a KEEP_ALL policy, the thread writing the status message and renewing the service lease could be blocked by a flow-control issue on the network, which could cause the durability service to be considered dead by the splice daemon when in fact there was no problem with the durability service. Solution: use a KEEP_LAST 1 history QoS policy for the writer. |
OSPL-9067 |
Large topics are published but not received Loss of the initial transmission of the final fragments of a large sample failed to cause retransmit requests for those fragments until new data was published by the same writer. Solution: ensure the receiving side will also request retransmission of those fragments based on heartbeats advertising the existence of the sample without giving specifics on the number of fragments. |
OSPL-9077 / 00016820 |
Potential crash in durability service during CATCHUP policy The durability service could crash while processing a CATCHUP event. This crash was caused by the garbage collector purging old instances while the CATCHUP policy was walking through the list of instances to do some bookkeeping. Solution: The CATCHUP policy now creates a private copy of the instance list while the garbage collector is unable to make a sweep. This private list is then used to do the bookkeeping. |
OSPL-9068 / 00016813 |
Catchup policy may leak away some instances When a node that performs a catchup to the master contains an instance that the master has already purged, then the node catching up would need to purge this instance as well. It would need to do this by re-registering the instance, inserting a dispose message and then unregistering this instance again. However, the unregister step was missing, causing the instance to effectively leak away since an instance is only purged by the durability service when it is both disposed AND unregistered. Solution: The durability will now both dispose AND unregister the instance at the same time. |
OSPL-9081 / 00016824 |
Potential deadlock in the OpenSplice kernel The OpenSplice kernel has a potential deadlock where two different code paths may claim locks in the opposite order. The deadlock occurs when one thread is reading/taking the data out of a DataReader while the participant's listener thread is processing the creation of a new group (i.e. a unique partition/topic combination) to which this Reader's Subscriber is also attached. Solution: The locking algorithm has been modified in such a way that the participant's listener thread no longer requires to hold both locks at the same time. |
OSPL-8956 |
Temporary blacklisting of remote participants in DDSI2 The DDSI2 service now provides an option to temporarily block rediscovery of proxy participants. Blocking rediscovery gives the remaining processes on the node extra time to clean up. It is strongly advised that applications are written in such a way that they can handle reconnects at any time, but when issues are found, this feature can reduce the symptoms. Solution: A new setting in the DDSI section of the configuration has been added: Internal/RediscoveryBlacklistDuration along with an attribute Internal/RediscoveryBlacklistDuration[@enforce]. The former sets the duration (by default 10s), the latter whether to really wait out the full period (true), or to allow reconnections once DDSI2 has internally completed cleaning up (false, the default). It is strongly discouraged to set the duration to less than 1s. |
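A sketch of the new setting and attribute named in the entry; the surrounding DDSI2Service element structure is assumed, and the shown values repeat the stated defaults:

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Internal>
      <!-- Block rediscovery of a disappeared participant for 10s; with
           enforce="false" (the default) reconnection is allowed as soon
           as DDSI2 has internally completed cleaning up. -->
      <RediscoveryBlacklistDuration enforce="false">10 s</RediscoveryBlacklistDuration>
    </Internal>
  </DDSI2Service>
</OpenSplice>
```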
OSPL-9071 |
v_groupFlushAction passes a parameter that is not fully initialized. Valgrind reported that the v_groupFlushAction function passes a parameter that is not fully initialized. Although one of these parameters was evaluated in a subsequent function invocation, it never caused issues because the value was only used as an operand for a logical AND where the other operand was always FALSE. Solution: All attributes of the parameter in question are now explicitly initialized. |
OSPL-9055 |
Potential Sample drop during delivery to a local Reader In some cases, a dispose followed by an unregister does not result in NOT_ALIVE_DISPOSED state on a Reader residing on the same node as the Publisher. In those cases, the Reader has an end state set to NOT_ALIVE_NO_WRITERS, and reports that a sample has been Lost. Solution: We have no clue what could cause this behaviour, but added some logging to capture the context of the erroneous sample drop. This is just a temporary measure, and will be reverted when the root cause has been found and fixed. |
OSPL-9056 |
Potential deadlock during early abort of an application When an application aborts so quickly that the participant's leaseManager thread and its resendManager thread have not yet had the opportunity to get started, then the exit handler will block indefinitely waiting for these threads to exit the kernel. However, both threads are already blocked waiting to access a kernel that is already in lockdown. Solution: The constructor of the participant will not return before both the leaseManager and resendManager threads have entered the kernel successfully. |
OSPL-8953 |
Potential deadlock between reader creation and durability notification A thread that creates a new DataReader and a thread from the durability service that notifies a DataReader when it has completed its historical data alignment grab two of their locks in reverse order, causing a potential deadlock. Solution: The locking algorithm has been modified so that these two threads no longer grab both locks in reverse order. |
OSPL-8886 |
Durability failure to merge data after a short disconnect When the disconnection period is shorter than twice the heartbeat period, a durability service may not have been able to determine a new master before the node is reconnected again. In that case no master conflict is generated. In case the durability service is "late" in confirming a master it might even occur that the master has updated its namespace, but the namespace update is discarded because no confirmed master has been selected yet. As a consequence no request for data will be sent to the master, and the durability service will not be aligned. Solution: In case a durability service receives a namespace update for a namespace for which no confirmed master has been selected yet, the update is rescheduled for evaluation at a later time instead of being discarded. |
OSPL-8948 / 16755 OSPL-8987 |
Race condition between durability data injection and garbage collecting of empty instances The durability service cached instance handles when injecting a historical data set in a way that could result in the historical samples being thrown away if the instance was empty and no known writers had registered it. Solution: the instance handle is no longer cached. |
OSPL-8971 |
Catchup policy may incorrectly mark unregistered instances as
disposed. When an instance is unregistered on the master node during a disconnect from another node that has specified a CATCHUP policy with that master, then upon a reconnect that unregister message will still be delivered to the formerly disconnected node. However, the reconnected node will dispose all instances for which it did not receive any valid data, so if the unregister message is the only message received for a particular instance, then that instance will be disposed. Solution: The Catchup policy is now instructed to dispose only those instances for which it received neither valid data nor an unregister message. |
OSPL-8984 |
DDSI handling of non-responsive readers needs improvement When a writer is blocked for ResponsiveTimeout seconds, DDSI will declare the matching proxy readers that have not yet acknowledged all data "non-responsive" and continue with those readers downgraded to best-effort. This prevents blocking outgoing traffic indefinitely, but at the cost of breaking reliability. For historical reasons it was set to 1s to limit the damage a non-responsive reader could cause, but past improvements to the handling of built-in data in combination with DDSI (such as fully relying on DDSI discovery for deriving built-in topics) mean there is no longer a need to have such an aggressive setting by default. Solution: The default behaviour has been changed to never declare a reader non-responsive and maintain reliability also when a remote reader is not able to make progress. The changes also eliminate some spurious warning and error messages in the log files that could occur with a longer timeout. |
OSPL-8920 |
DDSI2 Crash Version 6.6.3p4 introduced a fix for OSPL-8872, taking the sequence number most recently transmitted by a writer when it matched a reader into account to force heartbeats out until all historical data has been acknowledged by the reader. The change also allowed a flag forcing the transmission of heartbeats informing readers of the availability of data to be set earlier than before in the case where the writer had not published anything yet at the time the reader was discovered. While logically correct, this broke the determination of the unique reader that had not yet acknowledged all data in cases where there is such a unique reader. This in turn could lead to a crash. Solution: the aforementioned flag is once again never set before a sample has been acknowledged. |
OSPL-8974 |
Durability conflict scheduling fails when multiple namespaces have the same policy and differ only in topic names Durability checks for conflicts between fellows (master, native and foreign state) that may require merging data whenever it receives a "d_nameSpaces" instance. If a conflict is detected, it enqueues it for eventual resolution, but only if an equivalent conflict is not yet enqueued. Testing for equivalency is done by checking: conflict kind, roles and local and fellow namespaces. However, the name space compare function (d_nameSpaceCompare) did not take the name into account, nor the full partition+topic expressions. The consequence is that when namespaces A and B have identical policies and differ only in the topic parts of the partition/topic expressions, a conflict for namespace A would be considered the same as a conflict for namespace B. The result would be a failure to merge data in B. Solution: The comparison now takes the name of the namespace into account. The configuration is required to have no overlap between namespaces and to have compatible namespace definitions throughout the system. The name alone is therefore sufficient. |
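An illustrative sketch of the configuration shape that triggered this defect: two namespaces with identical policies, differing only in the topic part of their partition/topic expressions. The namespace names, expressions, and policy attribute values are hypothetical; the element structure follows the durability service configuration:

```xml
<OpenSplice>
  <DurabilityService name="durability">
    <NameSpaces>
      <NameSpace name="A">
        <PartitionTopic>sensors.Temperature</PartitionTopic>
      </NameSpace>
      <NameSpace name="B">
        <PartitionTopic>sensors.Pressure</PartitionTopic>
      </NameSpace>
      <!-- Identical policies: before the fix, a queued conflict for A
           was considered equivalent to a conflict for B, so B's data
           could fail to merge. -->
      <Policy nameSpace="A" durability="Durable" alignee="Initial" aligner="true"/>
      <Policy nameSpace="B" durability="Durable" alignee="Initial" aligner="true"/>
    </NameSpaces>
  </DurabilityService>
</OpenSplice>
```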
OSPL-8973 |
Additional durability tracing when verbosity is set to FINEST Durability has been extended with additional tracing in the processing of namespace definitions received from fellows, in particular when checking for master conflicts. |
OSPL-6112 |
An asymmetrical disconnect may lead to an inconsistent data state. The durability service relies on a reliable and symmetrical network topology. Every once in a while it is possible to experience temporary network hiccups resulting in a temporary asymmetrical network topology (durability service A sees B, but B does not see A). This can typically occur due to high load on one of the machines, or a networking problem. Such asymmetrical disconnects may lead to an inconsistent data state. To recover from such a situation the durability service must recognize when such an asymmetric disconnect occurs, and trigger the alignment actions to make the data state consistent again. Solution: Asymmetric disconnect situations are detected and the correct alignment actions are triggered to recover from the inconsistent data state. |
OSPL-9136 |
Protobuf isocpp2 example fails to build on E500mc build The isocpp2 protobuf example fails to build on E500mc due to the fact that the wrong compiler is referenced in the example build script. Solution: The build script has been modified to reference the correct compiler. |
OSPL-9267 / 16945 OSPL-9270 / 44079 OSPL-9334 |
Coherent set create, delete, unregister, recreate causes the
recreate to be lost When an unregister message was received from a coherent writer the connection (pipeline) between the group instance and reader instance was always destroyed immediately while the registration for the group instance was not removed immediately. The destruction of the pipeline with a still valid registration could cause samples written after the unregister to be dropped. Solution: On unregistration from a coherent writer immediately destroy the pipeline and process the unregistration for the group instance, like done for non-coherent writers. |
OSPL-9299 |
Durability shall allow configuring a master priority Currently it is almost impossible to control which durability service becomes master. This leads to situations where durability instances that should not become master can become master. Furthermore, it is currently not possible to indicate that a durability instance can act as aligner, but should not take up the responsibility to act as master for other instances. To provide more control over which node becomes master for a namespace, the //OpenSplice/DurabilityService/NameSpaces/Policy[@masterPriority] attribute can be specified. This is an optional attribute that specifies a value between 0 and 255. Value 0 means that a node will never become master, and value 255 indicates that the legacy master selection will be used. The default is 255. When the masterPriority is specified between 1 and 254, nodes with a higher masterPriority should become master. See the deployment manual for more information. Note that there are currently a few limitations.
Solution: A masterPriority has been implemented on the namespace policy that specifies the eagerness of a durability service to become master for a namespace. This gives a user more control. Note that mastership handover in case a better but late joining master arrives is currently unsupported. This will be addressed in the next release (see OSPL-9358 in known issues list). |
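As a sketch of how such a policy might be configured (the NameSpaces/NameSpace/Policy nesting and the other attribute values here are illustrative; only the masterPriority attribute is the subject of this entry, and the deployment manual is the authoritative element reference):

```xml
<DurabilityService name="durability">
  <NameSpaces>
    <NameSpace name="defaultNamespace">
      <Partition>*</Partition>
    </NameSpace>
    <!-- masterPriority="0" would mean: never become master;
         "255" (the default) selects the legacy master selection. -->
    <Policy nameSpace="defaultNamespace"
            masterPriority="10"
            aligner="true"
            alignee="Initial"
            durability="Durable"/>
  </NameSpaces>
</DurabilityService>
```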
OSPL-9329 |
Multiple concurrent topology changes may cause durability to
switch to majority voting The current implementation of the master selection algorithm in durability falls back to majority voting in case no agreement for a master can be established in a limited number (4) of election rounds. When various topology changes occur concurrently then it is likely that no agreement is reached in the maximum number of rounds. The master selection algorithm then falls back to majority voting. Because different durability services may have a different view of their world, majority voting may lead to different masters in the system and an inconsistent end state for the data on various nodes. Solution: To decrease the likelihood that the master selection algorithm falls back to majority voting we have increased the maximum number of rounds to 50 before the fallback will occur. This may lead to a longer master selection phase in case multiple topology changes are occurring concurrently, but increases the possibility that the end state is consistent. |
OSPL-9333 / 17011 |
DDSI2 private memory growth when unregistering T-L
instances/deleting entities without any peers present The DDSI2 service was holding on to the unregister messages for transient-local instances inside the writer history while disconnected from the rest of the world. When another node showed up, it would receive and acknowledge all these messages, but without significant effect on the receiver as they only described the unregistering of unknown instances/entities. Reception of an acknowledgement of a new T-L message or creation of a new entity sent while another node is connected would clear out these unregisters. Solution: When no peer exists, do not retain the unregister messages in the writer history cache. |
OSPL-9335 / 17009 |
Spliced not inheriting priority from "ospl start" The "ospl start" command always started spliced with a default priority in a default scheduling class (timeshare on typical platforms), independent of the actual priority of the "ospl" tool itself. This is contrary to what one would expect, and moreover made it impossible to force the threads that are not independently configurable to run at a certain priority. Solution: The behaviour of "ospl start" has been changed so that "spliced" now inherits its priority. |
OSPL-9365 |
The durability service can hang when requesting data from
multiple fellows and some of them leave after having provided a
partial set When the durability service determines it needs to request samples from multiple fellows to align with them (typically after becoming a master), it greedily deduplicates the incoming samples to reduce memory requirements before applying the configured merge policy. When a fellow disappears after providing some but not all samples in the set, the ones already received need to be removed from the received set. The administration of the received, deduplicated set was lacking the information needed to do this correctly. This then potentially resulted in the durability service waiting forever for the set to become complete. Solution: The durability service now annotates each sample with the set of fellows that have provided it. |
OSPL-9388 / 17030 |
Durability service might deadlock when the networking queue is
flooded. When the network queue is overrun by the durability service, the normal mode of operation is to sleep a bit and retry again later. However, there is a slight chance that the sending thread of the network service that needs to make room again by consuming elements in the queue will indirectly block on the sleeping thread in the durability service itself. Solution: The network service can no longer indirectly run into a lock that is held by the durability service while the network queue is flooded. |
OSPL-9408 |
Memory is leaked when readers request historical data. If a reader requests historical data, some internal administration is created. A small part of this is not freed, resulting in a memory leak. In case readers are created and destroyed in a loop, the total amount of memory that is lost can become significant. Solution: The identified leakage areas have been fixed. |
OSPL-8017 |
DDSI2 did not renew a participant lease for every received message The DDSI2 service discovers remote participants and automatically deletes them if they do not renew their leases in time. The lease renewal was tied to reception of data and of explicit lease renewal messages, and hence reception of, e.g., an acknowledgement would not lead to a lease renewal, even though it obviously requires the remote participant to be alive. Solution: DDSI2 now renews leases regardless of the type of message. |
OSPL-9485 |
Durability may mark aligned samples as duplicates while they are
not and drop them Durability combines incoming samples into a set while filtering out duplicates, but fails to take into account all fields that determine its uniqueness. This can lead to dropping samples from the set that would have affected the state of the system or result in durability never considering the set complete. Solution: The function to compare samples to determine uniqueness has been modified to ensure all relevant attributes are taken into account. |
OSPL-9501 |
Durability service may wait forever on asymmetric disconnect with
a fellow When a fellow is asymmetrically disconnected while samples are being requested it is possible that one or more of these sample requests are not received by the fellow, or that the response is lost. The durability service that sends the sample requests will be waiting for answers until all answers are received. But since the fellow was asymmetrically disconnected and requests or responses may have been lost, the durability service may wait forever for answers that will never come, thereby blocking progress of the durability service. Solution: When a fellow is asymmetrically disconnected all pending sample requests are discarded. In that case the durability service will not wait for answers anymore and continue operation. Reappearance of the fellow may lead to new alignment actions that are handled consecutively. |
OSPL-9502 |
Durability not always syncing with new master Durability services negotiate a master to align data from. It has been observed that a deceased fellow got chosen as master. Choosing a deceased fellow as master will evidently not lead to alignment and may lead to an inconsistent data state. Solution: The algorithm that caused the deceased master to be chosen has been modified to prevent this situation from happening. |
OSPL-9503 |
Durability may wait for responses from already disconnected fellow To keep a consistent state durability services may request data from each other. When a durability service is waiting for answers to such requests from a fellow and that fellow 'leaves', the durability service cleans up the pending requests correctly. However, when the fellow leaves while the durability service is concurrently sending out the requests, it is possible that not all requests are cleaned up correctly. In that case the durability service will keep waiting forever for data from the fellow that has left. This stalls the alignment process until the service is completely restarted. Solution: Slight changes in the locking strategy prevent concurrent sending of requests and cleaning them up on fellow disconnection. |
OSPL-9514 |
Memory leak in core Each time a DataReader is created a small amount of memory is leaked due to a missing free. Solution: The missing free has been added. |
OSPL-9522 |
DDSI2 transmit path can lock up when two partially asymmetric
disconnections overlap Consider a situation where: on node A the lease of B expires, and then on node B the lease from A expires while A rediscovers B, and then B rediscovers A. Then imagine that just before the lease of A expires on B, a heartbeat by a transient-local or endpoint discovery writer on A is multicasted and received, processed and responded to by B, ACK'ing everything. Then, in the particular case where A receives that acknowledgement just after rediscovering B, A will assume that B has received and ACK'd everything, and not send heartbeats out on B's behalf. In this situation, B will still send pre-emptive ACKs until it receives a heartbeat from A; this will trigger a retransmit of the missing data. If some of this data is lost on the network, and no other readers exist to force A to multicast heartbeats, then the DDSI specification requires B to wait with requesting a further retransmit until a heartbeat is received. (The most likely reason for this to happen is writing new data on A.) In the particular case where the lost data concerns endpoint discovery data, full connectivity will then not be established properly. If B has missed out on the definition of a data writer, then B will not acknowledge any data writer by that writer as it doesn't know the writer, which may cause the transmit buffers on A to fill up, and hence may lock up the DDSI transmit path on A until the reader on B is declared non-responsive. Solution: DDSI2 will now automatically re-request a retransmit after a configurable amount of time, by default 1s, but this may be disabled for strict compliance with the specification. Under normal circumstances, the retransmits will arrive much sooner, and so no additional network traffic will typically be generated by this change. |
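The re-request interval mentioned in the solution is an Internal DDSI2 tunable. As an assumption on our part, the element name in the sketch below (AutoReshedNackDelay) is the DDSI2 Internal option governing this timer; verify the exact name and accepted values against the deployment manual before relying on it:

```xml
<DDSI2Service name="ddsi2">
  <Internal>
    <!-- Assumed/hypothetical element name: automatically re-request a
         retransmit after 1 second of silence; "inf" would disable the
         timer for strict compliance with the DDSI specification. -->
    <AutoReshedNackDelay>1 s</AutoReshedNackDelay>
  </Internal>
</DDSI2Service>
```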
OSPL-9533 |
Durability with master priority 0 not retrieving data after
master reconnect The durability service clears the merge-state and master of its local namespaces when it is not an aligner (or has taken a mastership poison pill) and a fellow is removed with the same role even if that removed fellow was not the master of the name-space. Effectively this means the master is cleared even though it is still connected. Secondly, the durability service does not request the data from the new master if its master priority = 0. Solution: Only reset the master and clear the state of a nameSpace if the removed fellow actually is the master of the nameSpace. Additionally, durability with master priority 0 now requests data from master after resolving a master conflict. |
OSPL-9433 / 17161 |
Purging instances while merging hasn't finished may result in
wrong end-state of an instance When an instance is purged because it reached its end-state while merging, the merge may reintroduce the instance falsely. This can for example happen when an instance is contained in the merge, but while the merge is ongoing, the instance is disposed and unregistered. In that case the instance can disappear. When the merge is finished, there is no way for the middleware to tell whether the instance contained in the merge isn't valid anymore. Solution: While there is a merge or regular alignment action ongoing, purging of instances that have reached their end-state is deferred. There is a known limitation to the current implementation w.r.t. the use of RTNetworking. Furthermore, there is a small time-frame in which this suppression may not be effectuated correctly if a node is (re)connecting at the end of merging. This will be fixed in an upcoming release and is covered in OSPL-9612. |
OSPL-9542 |
Raising the namespace state for a namespace while there are still
pending conflicts for the namespace may cause scalability issues.
Whenever nodes (re)connect the durability service has a potentially inconsistent state. To resolve this the durability service generates conflicts and handles them one by one. When a conflict has been handled the durability service can currently raise the state of one or more namespaces for which it is master. Raising the state causes the slaves to acquire data. Currently, the state is potentially raised after each conflict. In case there are pending conflicts for the same namespace, raising the state is not very smart because it causes slaves to request data multiple times. A better approach would be to raise the state after the last conflict for the namespace has been handled. This would cause slaves to request the data only once instead of multiple times. Doing this improves scalability. Solution: The state of a namespace is now only raised when there are no more pending conflicts. |
OSPL-9561 |
When unregistration creates implicit registration the
unregistration is not processed When an unregistration message is the first message received it is used to create a registration message. This registration message has the same write time, gid and sequence number as the unregistration, and even though the state differed it was dropped as a duplicate. Solution: When comparing messages the state is now also used to determine if it is a different message. |
OSPL-9566 |
Possible durability crash when injecting transactions when for
one of the writers the DCPSTopic is not yet received When injecting an EOT message via durability a crash could happen when the list of writers available in the EOT was evaluated and for one of the writers in the list the DCPSTopic was not yet received while the DCPSPublication was. The combination of DCPSPublication and DCPSTopic was used for discovery of the writer, and the implementation assumed that it always received the DCPSTopic before the DCPSPublication; this assumption could lead to the crash. Solution: When discovering the writers in the EOT only the DCPSPublication is now used. |
OSPL-9593 |
Overflow of internal metadata reference counts Loading the same topic type into OpenSplice in a loop could overflow the reference counts of some internal metadata objects. This in turn could lead to freeing important metadata, eventually crashing the application. Solution: The reference counts are now forced to a maximum value if they would otherwise overflow. |
OSPL-9598 |
Deletion of DomainParticipant in ISOC++v2 API may trigger invalid
handle detected error report When an ISOCPP2 application creates 2 participants and closes the 2nd participant, the info log will contain a message like this: "invalid handle detected: result = U_RESULT_ALREADY_DELETED, Entity = 0x7ffed4c045f0 (kind = U_LISTENER)". This was caused by the fact that the explicit deletion of the Listener in question was postponed until after the deletion of its participant, which already implicitly deletes it as well. So the Listener effectively got deleted twice. Solution: The listener in question is no longer deleted explicitly; its deletion is left to the participant, which will eventually delete it implicitly. This way the listener can never be deleted twice. |
OSPL-9621 |
When all data for a partition/topic combination has become
complete, this new state is not always advertised The durability service is responsible for keeping durable data sets consistent. When a durability service has retrieved all data for a particular partition/topic combination, the data set is marked as complete. In some cases the event that the data is complete is not advertised. This may stall other durability services that wait for the data set to become complete. Solution: When the set of data for a partition/topic combination has become complete, the completeness is advertised. This may trigger alignment from remote nodes. |
OSPL-8883 OSPL-8970 / 16949 OSPL-9105 / 16841 |
Semantics for instance state not clearly defined after reconnect. The effects of a disconnect on the instance state of your data is clearly defined: all data originating from the disconnected node(s) becomes NOT_ALIVE_DISPOSED in case it was written with a writer using auto_dispose_unregistered_instances=TRUE or NOT_ALIVE_NO_WRITERS in case it was written with a writer using auto_dispose_unregistered_instances=FALSE. However, the effect of a reconnect to the previously disconnected node on these instance states was not clearly defined. Solution: We have now specified the following behavior for a previously disconnected instance:
|
OSPL-9069 / 16818 |
DDSI2 reports an inscrutable error when presented with a topic
definition it doesn't support DDSI2 reported errors such as "handleDataReader: new_reader: error -1" for topics it doesn't support. This typically means the length of the serialised key exceeds 32 bytes (counting strings as 4 bytes), but this was not in any way expressed by the error message. Solution: The error messages now properly identify the problem, including topic and type names. |
OSPL-9111 |
In case client durability is configured no historical data is
retrieved when a non-volatile reader is enabled. The reference manuals indicate that when a non-volatile reader is enabled, a request for historical data should be sent out. In case client durability is configured this request was not sent out. Consequently, no historical data is retrieved when the reader becomes enabled; only after an explicit call to wait_for_historical_data will historical data be retrieved. Solution: When a non-volatile reader becomes enabled a request for historical data is now sent out, causing historical data to be retrieved. |
OSPL-9243 / 16947 |
Instances may keep track of unlimited amounts of invalid samples. When an event occurs on a reader instance (like a dispose or unregister) an invalid sample carrying the event context is inserted in the history of this reader instance. These invalid samples may eventually be pushed or taken out of the history, but there are scenarios in which neither will occur (for example when an application repeatedly disconnects from and reconnects to the producer of the instance without any new data being added in the process, and without actively taking the resulting invalid samples out of the reader). In such cases the Reader might collect an unlimited amount of invalid samples, claiming more and more resources in the process without any restrictions. Solution: Each instance can now only have 1 invalid sample at most and an invalid sample can only be located at the tail of the instance. Newer invalid samples will simply replace older invalid samples. However, even a removed invalid sample will still cause an increment in the generation counts of the samples following it. |
OSPL-9358 |
Mastership handover. Until now late joining nodes always slave to an existing master. With the introduction of master priorities (see //OpenSplice/DurabilityService/NameSpaces/Policy[@masterPriority]) it is possible to assign a mastership preference to nodes. In case a late joining node has a higher preference than the existing master, the late joining node should become master instead of slaving to the existing node. This requires handover of mastership. Solution: Late joining nodes with a higher mastership preference than the current master now trigger a merge so that handover of mastership is established. |
OSPL-9437 |
When using a topic with history depth (default) it is possible
that a group coherent transaction never becomes complete when live
data and alignment data are received simultaneously. When the durability service aligns a group coherent transaction for which samples have been pushed out of the history due to history limits and part of that same transaction is received via live communication it is possible that the transaction never becomes complete on the federation that's being aligned. When the aligned federation received the EOT via live communication and a special EOT via alignment, the internal EOT counter gets an invalid value which results in the group never being marked as complete. Solution: Reception of EOT via live communication and alignment is now handled correctly. |
OSPL-9478 |
Reader not getting locally stored transactional historical data
when last message was an unregistration. When creating a reader on a federation that has open historical transactions and the last message in that transaction is an unregistration the historical open transaction is not injected in the reader as the injection tries to use a non-existing pipeline. Solution: No longer use the pipeline for injecting locally stored transactional historical data into the reader. |
OSPL-9486 |
Complete group coherent set not flushed when completed by
publicationInfo notification on the reader When a publicationInfo notification on the reader completes a group coherent set, the set was not flushed when it became complete. In case no transactions would follow, the readers or subscriber would never be notified; only after calling begin_access was the set available to the readers. Solution: A publicationInfo notification on the reader now flushes the set to the readers. |
OSPL-9534 |
The reader operation take_next_instance might skip instances with
invalid samples. The reader operation take_next_instance is expected to return the next matching instance starting from the instance identified by the specified instance handle. However, in case the next matching instance contained only an invalid sample, instead of returning this invalid sample the instance would be skipped and the next matching instance would be returned instead. Solution: The algorithm to identify the next matching instance has been corrected to no longer skip an instance that has only got an invalid sample. |
OSPL-9562 |
Fake heartbeat unregistered while never written The splice daemon is responsible for writing fake heartbeats on receiving a DCPSParticipant from a remote federation; the writing of these heartbeats happens conditionally, based on existing heartbeats. The splice daemon, however, unconditionally unregisters its fake heartbeat. From version V6.5.0p5 and up the DDSI and RTNetworking services write a heartbeat before forwarding any samples, so in that case the splice daemon will not write a fake heartbeat but does unregister the fake heartbeat. The unregistering of the fake heartbeat led to an invalid liveliness state (too high), which could cause disconnects to be missed at the application level. Solution: Unregistering of the fake heartbeat is now also conditional. |
OSPL-9652 |
When an asymmetric disconnect occurs while there are pending
sample requests, alignment may not progress. Asymmetric disconnects may occur at any time. In particular, it is possible that an asymmetric disconnect occurs while a durability service has outstanding sample requests to another durability service. In case a durability service (say A) has lost connection with another durability service (say B) but not vice versa, and B is waiting for answers to outstanding sample requests from A, then B might not receive an answer because A has lost connection to B. This may stall alignment. In such situation B must cancel the pending sample requests to A, and re-initiate alignment as soon as bidirectional communication is established again. Solution: When an asymmetrical disconnect is detected pending sample requests to the asymmetrically disconnected node are cancelled. |
OSPL-9676 |
Creating (enabling) a data writer in a non-coherent publisher
between begin/end coherent set sets transactionId on writer Creating (enabling) a data writer in a non-coherent publisher between begin/end coherent set copies the next sequence number (1) to the transactionId on the writer, which then causes all published data to be treated as if it were part of a transaction that will never be committed. The data will be delivered because the only matching readers are non-coherent, but transient data will be retained in the transaction administration in the group. Solution: A transactionId is no longer set when creating a writer in a non-coherent publisher. |
OSPL-9677 |
The durability service does not parse the
//OpenSplice/Domain/GeneralWatchDog and
//OpenSplice/DurabilityService/Watchdog configuration items
correctly. Watchdog threads are used to track progress of services. Users can specify the properties of the watchdog thread using the generic //OpenSplice/Domain/GeneralWatchDog setting. Individual services like the durability service can override these values by specifying //OpenSplice/DurabilityService/Watchdog. The durability service did not parse these settings correctly, causing the specified values to be non-effective. Solution: The parsing has been changed so that changes to these settings now take effect. |
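A minimal sketch of the two settings involved (the Scheduling child elements and their values are illustrative; the exact nesting is documented in the deployment manual):

```xml
<OpenSplice>
  <Domain>
    <!-- Generic watchdog settings, applied to all services... -->
    <GeneralWatchDog>
      <Scheduling>
        <Class>Default</Class>
        <Priority>0</Priority>
      </Scheduling>
    </GeneralWatchDog>
  </Domain>
  <DurabilityService name="durability">
    <!-- ...which an individual service may override. -->
    <Watchdog>
      <Scheduling>
        <Class>Realtime</Class>
        <Priority>10</Priority>
      </Scheduling>
    </Watchdog>
  </DurabilityService>
</OpenSplice>
```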
OSPL-9701 |
The auto_dispose setting of a DataWriter is not always processed
atomically. When the DataWriter uses an auto_dispose_unregistered_instances setting of TRUE, the reader side would not expect an instance state of this Writer to ever become NO_WRITERS. However, because the auto_dispose setting is not always handled atomically, in certain scenarios we might nevertheless end up with a reader instance whose instance state is NO_WRITERS. An example of such a scenario is when an autodisposing DataWriter explicitly disposes and then unregisters an instance. If a late joining node requests alignment between the dispose and the unregister, then the unregister message might be received first through the normal network connection, while the older dispose message arrives later as part of the alignment process. That means the instance state will first go to NO_WRITERS, and the older DISPOSE message can no longer undo this. Solution: An unregister message from an autodisposing writer will now always be treated as a combined DISPOSE and UNREGISTER message. That means that if the unregister arrives before the explicit dispose, it will still set the instance to DISPOSED and not to NO_WRITERS. |
Report ID. | Description |
---|---|
OSPL-9085 / 16827 |
Race condition between multiple simultaneously started OpenSplice daemons with the same domain configuration. Starting multiple OpenSplice daemons in shared memory mode for the same domain configuration at the same time could lead to a crash. Solution: The race condition between the multiple OpenSplice daemons is fixed; a subsequently started daemon will now be detected correctly and exit properly. |
OSPL-9672 |
Java5 HelloWorld example creates topic with wrong QoS. The Java5 HelloWorld example creates a topic with the default QoS. This differs from all other language binding HelloWorld examples. As a result an error message will be reported when this example wants to communicate with the HelloWorld examples of other language bindings. Solution: The defect is fixed and the Java5 example now creates a topic with the correct QoS so it can also communicate with the HelloWorld examples from other languages. |
OSPL-9684 |
A failing networking service does not always restart with FailureAction configuration restart. When a networking service detects a failure, it will terminate as gracefully as possible. The splice daemon was not able to detect whether the networking service terminated due to a valid stop or due to a detected failure, which means the splice daemon did not restart the networking service when it terminated gracefully after a failure. Solution: The networking service state is now set to died when it terminates gracefully after detecting a failure, so the splice daemon can restart it. |
OSPL-9738 |
When the durability service terminates a mutex is not cleaned up properly and potentially causes a memory leak. The durability service uses various mutexes to protect access to shared resources by different threads. One such mutex is used to protect updates to a sequence number. This mutex was not cleaned up properly. Solution: The mutex is now cleaned up properly. |
OSPL-9797 / 17608 |
DataWriter deletion is not always reflected in the instance_states of the DataReader. When two nodes have unaligned clocks and are using BY_RECEPTION_TIMESTAMP destination ordering, and the clock of the sending node runs ahead of the clock on the receiving node, then the deletion of the DataWriter on the sending node may not always correctly be reflected in the instance_states of the DataReader. These instance_states should go to either NOT_ALIVE_DISPOSED or NOT_ALIVE_NO_WRITERS, depending on the auto_dispose_unregistered_instances setting of the DataWriter. However, when the clock skew is bigger than the lifespan of an instance, its instance_state might remain ALIVE after deletion of the DataWriter. Solution: The algorithm now always applies the correct instance_state in the above mentioned case. |
OSPL-6636 / 17203 |
In the Isocpp2 API a deadlock may occur when a listener is removed when a listener callback is in progress. When using the Isocpp2 API a deadlock may occur when removing a listener. Before a listener callback is called a mutex lock is taken to prevent the listener from being destroyed while the listener callback is active. When, in the listener callback, an Isocpp2 operation on the corresponding entity is called, the associated entity mutex lock may be taken. When at the same time the entity is being closed from another thread a deadlock may occur because the two mutex locks involved are acquired in a different order. Solution: For all listener operations a separate mutex is used to prevent the listener from being removed while a listener callback is active. This lock is used outside of the normal entity lock, which ensures that no deadlock can occur. |
OSPL-9675 / 17534 |
Unnecessary error trace in the classic C++ API. An error will be traced when getting the listener of a Subscriber, Publisher, DataWriter or DataReader when no listener was set. This isn't really an error and the traces are unnecessary and even unwanted. This error is also traced when calling Subscriber::notify_datareaders(). Solution: The error traces are removed. |
OSPL-9730 / 17556 |
Out of sync dispose_all_data. The dispose_all_data is an asynchronous C&M operation that is not part of a coherent update. It is advised to use BY_SOURCE_TIMESTAMP as destination_order when using dispose_all_data to negate its asynchronous behaviour. However, the dispose_all_data still used the BY_RECEPTION_TIMESTAMP of the built-in C&M operation. This introduced a small timing issue when another node executes the dispose_all_data and immediately writes a new sample. If the write overtakes the dispose_all_data, it is possible that the dispose_all_data disposes of that sample, while in fact the sample is newer than the dispose and should be retained. Solution: The dispose_all_data handling now uses the source timestamp instead of the reception timestamp, regardless of the built-in topic destination_order. |
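The corrected ordering rule can be sketched as follows (a hypothetical helper, not the OpenSplice implementation):

```python
# Illustrative sketch: after the fix, dispose_all_data is compared against a
# sample's *source* timestamp, so a write that overtakes the dispose on the
# wire is no longer disposed by mistake.

def sample_survives_dispose_all(sample_source_ts: int, dispose_source_ts: int) -> bool:
    """True when the sample is newer than the dispose and must be retained."""
    return sample_source_ts > dispose_source_ts
```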
OSPL-9823 |
C# API reports "Detected invalid handle" in error log during onDataAvailable callback. When the C# API performs an onDataAvailable callback on a DataReaderListener, it also logs a message in the error log that states that an invalid handle was detected. The error reported is harmless and totally irrelevant, and your application will continue to function properly. Solution: An onDataAvailable callback will no longer cause this error to show up in the log. |
Report ID. | Description |
---|---|
OSPL-9430 / 17158 |
The isocpp2 API does not report an error when the generated discriminant setter of a union is used incorrectly. For a union type the IDL preprocessor generates a setter for the discriminant of the union: the _d(val) function. However, this setter may only be used to set the discriminant to a label that corresponds to the currently selected case of the union. A union case may have more than one label associated with it; the _d(val) function may only set the discriminant to one of the alternative labels associated with the currently selected case. When this setter was used incorrectly the isocpp2 API did not raise an exception, which could cause a crash of the application. Solution: The generated discriminant setter function _d(val) now checks whether the specified value corresponds with the current state of the union and raises an exception otherwise. |
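The rule enforced by the fixed setter can be sketched as follows (an illustrative Python model of the IDL union semantics, not the generated C++ code; the label-to-case mapping is invented):

```python
# Illustrative model of the _d(val) check: setting the discriminant is only
# allowed when the new value selects the same union case as the active one.

class UnionSketch:
    # Map each discriminant label to the case it selects; two labels on the
    # same case are "alternative labels" and may be swapped via _d().
    CASE_OF = {1: "case_a", 2: "case_a", 3: "case_b"}

    def __init__(self, discriminant):
        self._discriminant = discriminant

    def _d(self, value):
        if self.CASE_OF[value] != self.CASE_OF[self._discriminant]:
            # The fix: raise instead of silently corrupting the union.
            raise ValueError("discriminant does not match the active case")
        self._discriminant = value
```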
OSPL-9444 |
Move iShapes example from isocpp to isocpp2. The iShapes example should work on top of the isocpp2 API instead of the deprecated isocpp API. Solution: The example has been ported to the isocpp2 API. |
Report ID. | Description |
---|---|
OSPL-8405 / 16519 |
The discovery channel of the networking service may miss disconnects when it is configured with a very small detection period. When the discovery channel of the networking service is configured to detect the death (disconnect) of a node within a few hundred milliseconds, the disconnect may not be detected. In that case the discovery channel may miss the disconnect because the evaluation of the discovery heartbeats is scheduled at the wrong time. This may cause a reliable channel to stall when an undetected reconnect of a node occurs. Solution: The scheduling of the heartbeat evaluation has been moved to the receive thread of the discovery channel to make it independent of the scheduling of the main networking thread. Furthermore, the heartbeat evaluation time is now more strictly related to the maximum interval within which a heartbeat should have been received from a node. |
OSPL-9027 |
idlpp is not robust to paths with whitespace. The idlpp tool is not able to handle paths that contain whitespace when compiling for C++ and thus using cppgen. Solution: Quotes are now added to the arguments for cppgen that are generated by idlpp. |
OSPL-9030 |
OSPL_URI not set correctly in console run. The OSPL_URI can be set using the Launcher Settings dialog. The user can select a file using the browse dialog by clicking the "..." button. The file path must be prepended with "file://". When using the browse dialog, the "file://" was not being included. Solution: When using the OSPL_URI browse dialog, the "file://" is now prepended to selected file path. |
OSPL-9363 / 17024 OSPL-9075 / 16821 |
Possible spliced deadlock when other services crash. When another service or application crashes and leaves the kernel in an undefined state, it can happen that the splice daemon deadlocks and never shuts down. Solution: Added a thread watchdog to spliced that aborts when the shutdown thread deadlocks. |
OSPL-9389 |
Potential crash when removing certain entities concurrently in Classic Java PSMs. When a DataWriter or DataReader is removed by one thread while a different thread is removing the corresponding Subscriber or Publisher in the Classic Java PSMs (SAJ and CJ), the application can crash. Solution: The issue was resolved by changing the locking strategy so that a Publisher/Subscriber cannot be removed while one of its DataWriters or DataReaders is in use by a different thread. |
OSPL-9491 |
Terminating Streams example causes invalid memory free. When the Streams subscriber or publisher example is terminated before it completes, e.g. by pressing Ctrl-C, it can trigger an invalid free of the partition name. Solution: The issue is resolved by using a copy of the partition string that can be freed under all circumstances. |
OSPL-9506 |
RnR service crash when replaying DCPSTopic The RnR service can crash when replaying DCPSTopics. This is due to an improper free. Solution: The improper free in the RnR service is fixed. |
OSPL-9540 / 17190 |
The networking service exits with a fatal error when a best-effort channel runs out of buffer space. When all the defragmentation buffers for a best-effort channel are in use and a new buffer is needed, the networking service first reclaims fragment buffers from messages that are not yet complete (fragments missing) or from the list of buffers waiting to be defragmented. When it fails to free a buffer, the networking service terminates with a fatal error indicating that it has run out of defragmentation buffers. Solution: When a best-effort channel runs out of defragmentation buffers and is not able to reclaim a buffer, it now stops reading from the socket until buffers become available again. Note that this may cause fragments to be lost when the maximum receive buffer size of the socket is exceeded; for a best-effort channel this is allowed. |
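The new decision logic can be sketched roughly as follows (a hypothetical model, not the networking service source):

```python
# Illustrative sketch of the fixed best-effort buffer handling: when no
# buffer can be freed or reclaimed, socket reads are paused instead of the
# service terminating with a fatal error. Fragments arriving meanwhile may
# be lost, which is acceptable for a best-effort channel.

def on_buffer_needed(free_buffers: int, reclaimable_buffers: int) -> str:
    """Action a best-effort channel takes when it needs a new buffer."""
    if free_buffers > 0:
        return "use_free_buffer"
    if reclaimable_buffers > 0:
        return "reclaim_buffer"
    return "pause_socket_reads"   # previously: fatal error
```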
OSPL-9551 |
Some buttons on Tuner's Writer Pane are connected to wrong Writer functions. Some buttons on the Tuner's Writer Pane are connected to the wrong Writer functions.
Solution: The handlers for the various buttons have been corrected. |
OSPL-9552 / 17191 |
The networking service should log the selected network interface. The networking service does not report the network interface that it has selected. To support the analysis of problems on hosts that have more than one network interface configured, the networking service should report the selected network interface. Solution: On startup of the networking service the selected network interface is reported in the info log. The name of the interface is also included in the report that indicates that networking has detected a state change of the network interface (down/up). |
OSPL-9608 |
Samples can be purged prematurely. Purging samples depends (among other things) on the service_cleanup_delay. If OpenSplice is started when the uptime of the node is smaller than the service_cleanup_delay, disposed samples will be purged prematurely. Solution: Improved the check between the node uptime and the service_cleanup_delay. |
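The improved check can be sketched like this (an illustrative model with hypothetical names, not the OpenSplice kernel code):

```python
# Illustrative sketch: a disposed sample may only be purged once the full
# service_cleanup_delay has elapsed since the dispose, which cannot be the
# case while the node's uptime is still shorter than that delay.

def may_purge(now: float, dispose_time: float,
              node_boot_time: float, service_cleanup_delay: float) -> bool:
    if now - node_boot_time < service_cleanup_delay:
        return False  # the node has not even been up long enough
    return now - dispose_time >= service_cleanup_delay
```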
TSTTOOL-437 / 17181 |
OpenSplice Tester crash When running tester scripts in headless mode, ArrayIndexOutOfBoundsExceptions are sometimes thrown and logged to the console. The exceptions were the result of a race condition on startup. Solution: A fix was made in the MainWindow to prevent the race condition when populating filtered topics. |
Report ID. | Description |
---|---|
OSPL-9047 / 16793 |
Deadline-missed events are not necessarily triggered per instance When multiple deadlines are missed around the same time and a listener is used to monitor these events, only a single listener callback may be performed. Solution: A separate listener callback is performed for each missed deadline. |
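The fixed dispatch behaviour can be sketched as follows (hypothetical names, not the OpenSplice listener API):

```python
# Illustrative sketch: every instance whose deadline was missed now triggers
# its own listener callback, instead of the expirations being coalesced into
# a single notification.

def dispatch_deadline_missed(expired_instances, callback):
    """Invoke the deadline-missed callback once per expired instance."""
    for instance in expired_instances:
        callback(instance)
```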
RNR-704 / 17155 |
RnR CDR recording cannot be imported RnR is not properly processing the situation where there is no active union case. Solution: RnR now accepts the option of not having an active union case. |
Report ID. | Description |
---|---|
OSPL-7942 OSPL-9207 / 16854 |
Thread-specific memory leaking away after thread end. Threads that are created from a user application don't use the internal OpenSplice thread wrapper. Thread-specific memory allocated by the OpenSplice software stack isn't freed when these threads exit. Solution: OS-supplied callback functionality is now used to call a destructor function that frees the allocated memory when a user-created thread exits. |
OSPL-8425 / 16551 |
When a lot of fragmented best-effort data is received, the receiver will run out of buffer space. The defragmentation buffers are shared with the receive buffer. When a lot of data is received, the network stack is not able to free claimed buffer space because it does not get time to defragment the data in the buffers. As a result, incoming data cannot be stored and networking cannot recover from this situation, as these buffers remain locked. Solution: When no buffers can be freed, data in buffers is dropped to create free space, so that new data can be received and the service continues to function. Data in the dropped buffers is lost, but as this is best-effort data, this is allowed. |
OSPL-8813 |
Calling RMI C++ CRuntime::stop() can cause deadlocks When RMI C++ CRuntime::stop() is called, it can deadlock due to timing. Two threads will wait until the other is stopped. Solution: By caching certain information, it's not necessary for one thread to wait for the other while stopping. |
OSPL-8958 |
DDSI can regurgitate old T-L samples for instances that have already been unregistered. DDSI maintains a writer history cache for providing historical data for transient-local writers and for providing reliability. An instance is removed from this cache when it is unregistered by the writer, but its samples are retained until they have been acknowledged by all (reliable) readers. Already acknowledged samples that were retained because they were historical data could survive even when the instance was removed. When this happened, a late-joining reader would see some old samples reappear. Solution: Deleting an instance now also removes the already acknowledged samples from the history. |
OSPL-9058 / 16796 OSPL-9206 / 16852 |
Incompatibility with versions before V6.5.0p5. An internal change to the builtin heartbeat topic caused an incompatibility with older versions. When adding a node running a recent version of OpenSplice to a domain with nodes running a version before V6.5.0p5, the existing nodes would incorrectly dispose participants (and corresponding entities) belonging to the new nodes after a single heartbeat period, which is normally done only when a heartbeat expires. Solution: To resolve this, the change to the heartbeat topic was reverted. |
OSPL-9059 / 16803 |
Custom lib missing for ISOCPP2 on DDSCE-P708-V6x-MV-A. The custom lib for ISOCPP2 was missing on the DDSCE-P708-V6x-MV-A target platform. Solution: The custom lib has been added for this platform as well. |
OSPL-9113 |
When the persistent store contains an unfinished transaction, a non-coherent reader may not receive the corresponding historical data. When the durability service injects persistent data, the persistent data set contains unfinished transactions, and a non-coherent reader is present, that reader will not receive the data of these unfinished transactions. For persistent data the durability service unregisters each instance after injecting it. However, the injection of the historical samples of a transaction expects that the instance is still registered. Solution: When the retrieved historical data contains unfinished transactions, the corresponding samples are now injected into the non-coherent reader independently of the existence of a registration for the corresponding instance. |
OSPL-9208 |
DDSI not sending an SPDP ping at least every SPDPInterval. DDSI has a Discovery/SPDPInterval setting that is meant to set an upper bound on the SPDP ping interval, which is otherwise derived from the lease duration set in the //OpenSplice/Domain/Lease/ExpiryTime setting. The limiting only occurred when the lease duration was > 10s. Solution: The limiting has been changed to ensure the interval never becomes larger than what is configured. |
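The corrected computation can be sketched like this (an illustrative model of the described behaviour, not DDSI2 source; the derivation factor from the lease duration is invented):

```python
# Illustrative sketch: the SPDP ping interval derived from the lease
# duration is now always clamped to Discovery/SPDPInterval, not only when
# the lease duration exceeds 10 seconds.

def spdp_ping_interval(lease_duration: float, spdp_interval: float) -> float:
    derived = lease_duration * 0.8   # hypothetical derivation factor
    return min(derived, spdp_interval)
```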
OSPL-9216 |
Calling RMI C++ CRuntime::run() can cause deadlocks When RMI C++ CRuntime::run() is called when the runtime is already running, then the running state is detected and that call will leave the runtime immediately. However, it does so without unlocking a mutex. Further interaction with that runtime is likely to hit that locked mutex and thus deadlock. Solution: Unlock the runtime when a problem is detected during the run() call. |
OSPL-9240 / 16923 |
Memory leak in entities with listener The listener administration shared by all language bindings, leaks a small amount of heap memory each time an entity is removed that has a listener attached to it. Solution: The issue was resolved by properly freeing admin data when a listener is detached from an entity manually or when the entity is removed. |
OSPL-9248 |
Error while building FACE C++ example on Windows 64-bit. When building the FACE C++ example on Windows 64-bit, an error LNK2001: unresolved external symbol, coming from the DataState class, could occur. Solution: The defect in the DataState class has been fixed and the error will not occur anymore. |
OSPL-9250 / 16928 |
RMI Java runtime stop can cause NullPointerException. By external interference of the RMI internals or specific timing, it is possible that RMI Java runtime stop() can throw a NullPointerException. Solution: Added various null pointer checks. |
OSPL-9291 |
Possibly wrong durability master confirmation due to not waiting for the heartbeat expiry period. When a reconnect occurred, it was possible that durability services confirmed a federation as master while that federation itself had confirmed a different federation as master. This would lead to durable data not being aligned until a new durability conflict was triggered. The cause of this problem was that the durability service should wait for the heartbeat expiry period before confirming a master, but this wait did not occur. Solution: During master selection the wait for the heartbeat expiry period now works again. |
OSPL-9361 / 17014 |
Incorrect IDL code generation for IDL that contains an array of a typedef which refers to a string type. For an IDL definition that contains an array of a typedef where the typedef refers to a string, the copy-in routines generated by the IDL preprocessor (idlpp) are incorrect: the string members are copied to a wrong memory location. Solution: The code that is generated by the IDL preprocessor for this array type has been corrected. |
OSPL-9388 / 17030 |
Durability service might deadlock when the networking queue is flooded. When the network queue is overrun by the durability service, the normal mode of operation is to sleep a bit and retry again later. However, there is a slight chance that the sending thread of the network service that needs to make room again by consuming elements in the queue will indirectly block on the sleeping thread in the durability service itself. Solution: The network service can no longer indirectly run into a lock that is held by the durability service while the network queue is flooded. |
OSPL-9413 |
Memory leak in RnR service when using XML storage. The RnR service leaks memory for every data update when using XML storage. Solution: Free internal RnR service variable. |
OSPL-9441 |
IsoCPP dds::sub::CoherentAccess destruction will throw an exception when end() was called on that object. The dds::sub::CoherentAccess destructor will end a coherent access. When the coherent access was already ended by a call to dds::sub::CoherentAccess::end() prior to the destruction of that object, the destructor will throw a precondition_not_met exception when it tries to end the already ended coherent access. An exception in a destructor can cause undefined behaviour (like a crash on VS14). Solution: Check whether the coherent access has already been ended before ending it again in the destructor. |
TSTTOOL-434 |
Participant reader and writer tables - QoS column display and tooltip incorrect In the Browser tab, if the user navigated to a participant, the corresponding reader and writer tables did not display the qos values correctly. Solution: This regression was introduced in 6.7.0 and has been fixed in 6.7.1. |
Report ID. | Description |
---|---|
OSPL-8709 / 16660 |
Problem with triggered read condition when ordered access is used. When using ordered access there is a possibility that a reader continues to receive data-available events while reading or taking results in no data available. This is caused by a dispose not being purged, causing invalid samples to be retained in the data reader. Solution: When taking data, invalid samples are now purged instead of being retained in the reader. |
OSPL-8950 / 16756 |
Memory increase of inactive RMI proxies. An RMI proxy will receive replies (of the related service) meant for other proxies on other nodes. The middleware takes care of filtering the proper replies for the proxies on the local node. However, when the RMI proxy is inactive (it has been created but is not used to send requests), the replies are not cleared from the proxy reply reader's memory. When the RMI proxy becomes active (sending a new request), all replies are taken from the reader and the memory returns to normal. However, the memory usage can increase on a node while the proxy is inactive. Solution: The proxy reply reader is now always monitored and unrelated replies are removed from its memory. |
OSPL-8438 / 16569 |
Builtin types not properly supported in ISOCPP2 When using any of the builtin types (types defined in dds_dcps.idl or defined in its included files) in your own IDL file (properly including the relevant IDL file in which it is defined), then the copy functions generated by idlpp would have compilation issues where referenced functions could not be found. Solution: The previously missing referenced functions have now been added to the ISOCPP2 library. |
OSPL-9122 |
Error report during termination when LivelinessLost listener callback was set When using a LivelinessLost listener callback in your application it could happen that during termination of the application an error report "Internal error DataWriter has no Publisher reference" occurs in the ospl-error.log file. Solution: The defect is fixed and the error will not occur anymore. |
OSPL-9015 |
Deletion of a reader after all group coherent transactions were received and read could lead to new transactions never becoming complete. When deleting a group coherent reader that was part of a previously complete and read group transaction, it was possible that other readers in that group coherent subscriber would not be able to read a newly sent transaction. The new transaction was never marked complete since it still expected data for the deleted reader. Solution: The coherent administration has been updated so that the deleted reader is no longer part of future group transactions. |
OSPL-8900 |
Tuner - When writing data with the standalone writer frame, it is not possible to edit collections. Normally, in the reader-writer frame, right-clicking on a sequence user data field brings up a context menu to Add Details to the collection. The same action is missing from the standalone writer frame. Solution: The writer frame's table now has the ability to edit collections in the same manner as the reader-writer frame does. |
OSPL-8577 |
Tuner - When writing data with the standalone writer frame, the data model is not properly initialized. When writing data in the standalone writer frame, accessed via the context menu option "Write data" on any Writer tree entity, clicking the Write button without first editing any of the default table values results in a write error. Solution: The table's backing DDS data object is now properly initialized with the table's values when a writer action is called. |
OSPL-8507 |
Transactions with explicit registration messages never become complete. When calling begin_coherent_changes, then registering an instance, and finally calling end_coherent_changes, the whole transaction could fail because the register_instance was not always sent. Solution: The defect in the transaction mechanism is solved and registration messages are now properly sent. |
OSPL-8930 |
The lease manager has to support leases which use the monotonic clock and leases which use the elapsed-time clock. The lease manager is used to handle events that have to be processed at a certain time. For example, to monitor the liveliness of the services, the lease manager has to evaluate the corresponding leases at regular intervals. These leases use the monotonic clock to support hibernation. However, the lease manager is also used to monitor the liveliness of instances, or, when the deadline QoS is set, to monitor whether the deadline settings are satisfied. For these leases the elapsed-time clock is used. Internally, the lease manager used only the monotonic clock. This may cause problems for leases which use the elapsed-time clock when hibernation occurs. Solution: The lease manager now supports both leases based on the monotonic clock and leases based on the elapsed-time clock, using the appropriate clock for each. |
OSPL-9171 |
A Condition detach from a Waitset can block. If a thread is waiting on a waitset and another thread detaches a condition from that waitset, the waitset is triggered after which the detach can take place. However, if the first thread is very quick (or the detaching thread is slow), it can happen that the first thread already enters the wait again while the detach didn't detect the trigger yet. When that happens, the detach will block (at least) until the waitset is triggered again. Solution: Wait for a detach to finish when entering a waitset wait. |
OSPL-7980 |
DDSI2 retransmits full sample on a retransmit request for the
sample, even if the sample is huge The DDSI reliable protocol offers two different ways of requesting a retransmit of some data: a sample retransmit request and, for fragmented data (i.e., large samples), a fragment retransmit request. DDSI2 would always retransmit the full sample upon receiving a sample retransmit, even if that sample is huge, instead of retransmitting a "reasonable" amount and relying on further fragment retransmit requests. Solution: The DDSI2 service now retransmits a limited amount of data when receiving a retransmit request for a full sample. |
OSPL-8017 |
DDSI2 did not renew a participant lease for every received message The DDSI2 service discovers remote participants and automatically deletes them if they do not renew their leases in time. The lease renewal was tied to reception of data and of explicit lease renewal messages, and hence reception of, e.g., an acknowledgement would not lead to a lease renewal, even though it obviously requires the remote participant to be alive. Solution: DDSI2 now renews leases regardless of the type of message. |
OSPL-8636 |
Streams throughput example hang on Windows. When using the Streams throughput example on Windows, the subscriber application could hang on termination. Solution: The defect is fixed and the subscriber will not hang anymore. |
OSPL-8660 |
RMI C++ does not reduce thread pool size immediately. The threadPoolSize of the RMI C++ ServerThreadingPolicy can be reduced. Threads within the thread pool quit after they have handled a request task when the number of threads is larger than the required threadPoolSize. Because request tasks are handled before the number of threads is reduced, it is possible for the server to get more parallel calls than expected. Solution: Threads within the thread pool now quit until the number of threads equals the threadPoolSize, and only then continue handling request tasks. |
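The corrected shrink order can be sketched as follows (an illustrative model with hypothetical names, not the RMI implementation):

```python
# Illustrative sketch: surplus threads are retired *before* more request
# tasks are handled, so the server never serves more parallel calls than
# the configured threadPoolSize.

def worker_step(current_threads: int, thread_pool_size: int) -> str:
    """Decide what a pool thread does next."""
    if current_threads > thread_pool_size:
        return "quit"          # shrink the pool first
    return "handle_request"    # then continue serving requests
```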
OSPL-8739 / 16698 |
Too generic c_insert symbol The c_insert symbol is exported in one of the OpenSplice libraries. This is too generic and causes clashes with other software packages. Solution: The c_insert function has been renamed by prefixing it with the ospl_ prefix. |
OSPL-8828 |
Possible buffer overflow when using Google Protocol Buffers on Windows. When using Google Protocol Buffers on Windows with Isocpp, it can happen that the application crashes with a buffer overflow. This is due to a defect in the translation from GPB to DDS data types. Solution: The defect is fixed and the overflow will not occur anymore. |
OSPL-9097 |
DDSI transmit path can lock up on packet loss to one node while another node has crashed. Consider a successful retransmit to one remote reader while another remote reader that has not yet acknowledged all samples disappears (because of a loss of connectivity or a crash), all other remote readers have acknowledged all samples, and the writer has reached the maximum amount of unacknowledged data. In that case the transmit path in DDSI could lock up, because the writer could then only be unblocked by the receipt of an acknowledgement message covering a previously unacknowledged sample, which under these circumstances will never come because of the limit on the amount of unacknowledged data. Solution: Deleting a reader now not only drops all unacknowledged data but also clears the retransmit indicator of the writer. |
OSPL-9096 |
Durability service DIED message even though the durability service is still running. The d_status topic is published periodically by the durability service to inform its fellows of its status. Because a KEEP_ALL policy was used, the thread writing the status message and renewing the service lease could be blocked by a flow-control issue on the network, which could cause the durability service to be considered dead by the splice daemon when in fact there was no problem with the durability service. Solution: A KEEP_LAST 1 history QoS policy is now used for the writer. |
OSPL-9067 |
Large topics are published but not received. Loss of the initial transmission of the final fragments of a large sample failed to cause retransmit requests for those fragments until new data was published by the same writer. Solution: The receiving side now also requests retransmission of those fragments, based on heartbeats advertising the existence of the sample without giving specifics on the number of fragments. |
OSPL-9077 / 00016820 |
Potential crash in durability service during CATCHUP policy The durability service could crash while processing a CATCHUP event. This crash was caused by the garbage collector purging old instances while the CATCHUP policy was walking through the list of instances to do some bookkeeping. Solution: The CATCHUP policy now creates a private copy of the instance list while the garbage collector is unable to make a sweep. This private list is then used to do the bookkeeping. |
OSPL-9068 / 00016813 |
Catchup policy may leak away some instances. When a node that performs a catchup to the master contains an instance that the master has already purged, the node catching up needs to purge this instance as well. It does this by re-registering the instance, inserting a dispose message and then unregistering the instance again. However, the unregister step was missing, causing the instance to effectively leak away, since an instance is only purged by the durability service when it is both disposed AND unregistered. Solution: The durability service will now both dispose AND unregister the instance at the same time. |
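The purge rule described above can be sketched in a few lines (illustrative model, hypothetical names):

```python
# Illustrative sketch: an instance is only purged once it is both disposed
# AND unregistered, so the fix makes the catching-up node apply both at the
# same time instead of leaving the unregister out.

def eligible_for_purge(disposed: bool, unregistered: bool) -> bool:
    return disposed and unregistered

def catchup_purged_instance() -> bool:
    # Before the fix only the dispose was inserted; the missing unregister
    # kept the instance alive forever. Now both are applied together.
    return eligible_for_purge(disposed=True, unregistered=True)
```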
OSPL-9081 / 00016824 |
Potential deadlock in the OpenSplice kernel The OpenSplice kernel has a potential deadlock where two different code paths may claim locks in the opposite order. The deadlock occurs when one thread is reading/taking the data out of a DataReader while the participant's listener thread is processing the creation of a new group (i.e. a unique partition/topic combination) to which this Reader's Subscriber is also attached. Solution: The locking algorithm has been modified in such a way that the participant's listener thread no longer requires to hold both locks at the same time. |
OSPL-9064 / 00016808 |
Changes caused by OSPL-8914 in 6.6.3p4f4 have been reverted. OSPL-8914 in the 6.6.3p4f4 release made several changes to the durability service in order to solve problems where a rapid disconnect/reconnect cycle would leave the durability service in an undefined state. In these situations, a disconnect had not yet been fully processed when the reconnect occurred. However, the solutions provided in 6.6.4 caused other, previously non-existing errors during normal operation. Solution: All changes made as part of OSPL-8914 in the 6.6.4 release have been reverted. As an alternative solution to rapid disconnect/reconnect cycle issues, DDSI offers temporary blacklisting of recently disconnected participants (see OSPL-8956). |
OSPL-8956 |
Temporary blacklisting of remote participants in DDSI2. The DDSI2 service now provides an option to temporarily block rediscovery of proxy participants. Blocking rediscovery gives the remaining processes on the node extra time to clean up. It is strongly advised that applications are written in such a way that they can handle reconnects at any time, but when issues are found, this feature can reduce the symptoms. Solution: A new setting in the DDSI section of the configuration has been added: Internal/RediscoveryBlacklistDuration, along with an attribute Internal/RediscoveryBlacklistDuration[@enforce]. The former sets the duration (by default 10s), the latter whether to really wait out the full period (true), or to allow reconnections once DDSI2 has internally completed cleaning up (false, the default). It is strongly discouraged to set the duration to less than 1s. |
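A configuration fragment using the new setting might look like this (a sketch only: the surrounding element names and the exact value format are assumed from typical OpenSplice DDSI2 configurations, not stated in this entry):

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Internal>
      <!-- Block rediscovery of a disconnected participant for 10 seconds;
           enforce="false" allows reconnection as soon as cleanup completes. -->
      <RediscoveryBlacklistDuration enforce="false">10.0</RediscoveryBlacklistDuration>
    </Internal>
  </DDSI2Service>
</OpenSplice>
```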
OSPL-9071 |
v_groupFlushAction passes a parameter that is not fully initialized. Valgrind reported that the v_groupFlushAction function passes a parameter that is not fully initialized. Although one of these parameters was evaluated in a subsequent function invocation, it never caused issues because the value was only used as an operand for a logical AND where the other operand was always FALSE. Solution: All attributes of the parameter in question are now explicitly initialized. |
OSPL-9055 |
Potential sample drop during delivery to a local Reader. In some cases, a dispose followed by an unregister does not result in the NOT_ALIVE_DISPOSED state on a Reader residing on the same node as the Publisher. In those cases, the Reader has an end state set to NOT_ALIVE_NO_WRITERS and reports that a sample has been lost. Solution: The root cause of this behaviour has not yet been found; logging has been added to capture the context of the erroneous sample drop. This is a temporary measure and will be reverted when the root cause has been found and fixed. |
OSPL-9056 |
Potential deadlock during early abort of an application When an application aborts so quickly that the participant's leaseManager thread and its resendManager thread have not yet had the opportunity to get started, then the exit handler will block indefinitely waiting for these threads to exit the kernel. However, both threads are already blocked waiting to access a kernel that is already in lockdown. Solution: The constructor of the participant will not return before both the leaseManager and resendManager threads have entered the kernel successfully. |
OSPL-8953 |
Potential deadlock between reader creation and durability notification A thread that creates a new DataReader and a thread from the durability service that notifies a DataReader when it has completed its historical data alignment grab two of their locks in reverse order, causing a potential deadlock. Solution: The locking algorithm has been modified so that these two threads no longer grab both locks in reverse order. |
OSPL-8886 |
Durability failure to merge data after a short disconnect When the disconnection period is shorter than twice the heartbeat period, a durability service may not have been able to determine a new master before the node is reconnected. In that case no master conflict is generated. In case the durability service is "late" in confirming a master, it may even occur that the master has updated its namespace, but the namespace update is discarded because no confirmed master has been selected yet. As a consequence no request for data will be sent to the master, and the durability service will not be aligned. Solution: When a durability service receives a namespace update for a namespace for which no confirmed master has been selected yet, the update is rescheduled for evaluation at a later time instead of being discarded. |
OSPL-8914 |
Durability failure to merge data after a short disconnect When a node becomes disconnected it may lose its master. As a result the node will look for a new master. In doing so, the node would first unconfirm its current master and then wait for other fellows to propose a new master. The time to look for a new master is specified in the configuration file (DurabilityService.Network.Heartbeat.ExpiryTime). When the disconnection was shorter than DurabilityService.Network.Heartbeat.ExpiryTime, no merge was triggered. Solution: Whenever a node is discovered that is not simply starting and has no confirmed master, a merge is triggered, just like when there are conflicting masters. NOTE: This change has been reverted in 6.6.3p4f7. |
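For reference, the dotted path above corresponds to a configuration fragment along these lines (the nesting is inferred from the path, and the value shown is purely illustrative):

```xml
<OpenSplice>
  <DurabilityService name="durability">
    <Network>
      <Heartbeat>
        <!-- Illustrative value: time after which a missing fellow heartbeat expires -->
        <ExpiryTime>4.0</ExpiryTime>
      </Heartbeat>
    </Network>
  </DurabilityService>
</OpenSplice>
```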
OSPL-8948 / 16755 OSPL-8987 |
Race condition between durability data injection and garbage collecting of empty instances The durability service cached instance handles when injecting a historical data set in a way that could result in the historical samples being thrown away if the instance was empty and no known writers had registered it. Solution: The instance handle is no longer cached. |
OSPL-8971 |
Catchup policy may incorrectly mark unregistered instances as disposed. When an instance is unregistered on the master node during a disconnect from another node that has specified a CATCHUP policy with that master, then upon a reconnect that unregister message will still be delivered to the formerly disconnected node. However, the reconnected node will dispose all instances for which it did not receive any valid data, so if the unregister message is the only message received for a particular instance, that instance will be disposed. Solution: The Catchup policy is now instructed to dispose only those instances for which it received neither any valid data nor an unregister message. |
OSPL-8984 |
DDSI handling of non-responsive readers needs improvement When a writer is blocked for ResponsiveTimeout seconds, DDSI declares the matching proxy readers that have not yet acknowledged all data "non-responsive" and continues with those readers downgraded to best-effort. This prevents blocking outgoing traffic indefinitely, but at the cost of breaking reliability. For historical reasons the default was set to 1s to limit the damage a non-responsive reader could cause, but past improvements to the handling of built-in data in combination with DDSI (such as fully relying on DDSI discovery for deriving built-in topics) mean there is no longer a need for such an aggressive default. Solution: The default behaviour has been changed to never declare a reader non-responsive, maintaining reliability even when a remote reader is unable to make progress. The changes also eliminate some spurious warning and error messages in the log files that could occur with a longer timeout. |
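Deployments that prefer the old behaviour can presumably restore the aggressive timeout explicitly. A sketch, assuming the setting lives in the DDSI2 Internal section of the configuration:

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Internal>
      <!-- Restore the old default: declare a blocked reader non-responsive after 1s -->
      <ResponsiveTimeout>1s</ResponsiveTimeout>
    </Internal>
  </DDSI2Service>
</OpenSplice>
```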
OSPL-8920 |
DDSI2 Crash Version 6.6.3p4 introduced a fix for OSPL-8872, taking the sequence number most recently transmitted by a writer when it matched a reader into account to force heartbeats out until all historical data has been acknowledged by the reader. The change also allowed a flag forcing the transmission of heartbeats informing readers of the availability of data to be set earlier than before in the case where the writer had not yet published anything at the time the reader was discovered. While logically correct, this broke the determination of the unique reader that had not yet acknowledged all data, in cases where there is such a unique reader. This in turn could lead to a crash. Solution: The aforementioned flag is once again never set before a sample has been acknowledged. |
Report ID. | Description |
---|---|
OSPL-8425 / 16551 |
When a lot of fragmented, best-effort data is received, the receiver will run out of buffer space The defragmentation buffers are shared with the receive buffer. When a lot of data is received, the network stack cannot free claimed buffer space because it does not get time to defragment the data in the buffers. As a result, incoming data cannot be stored and networking cannot recover from this situation, as these buffers remain locked. Solution: When no buffers can be freed, data in the buffers is dropped to create free space, so that new data can be received and the service can continue to function. The data in the dropped buffers is lost, but since this is best-effort data, that is allowed. |
OSPL-8886 |
Durability failure to merge data after a short disconnect When the disconnection period is shorter than twice the heartbeat period, a durability service may not have been able to determine a new master before the node is reconnected. In that case no master conflict is generated. In case the durability service is "late" in confirming a master, it may even occur that the master has updated its namespace, but the namespace update is discarded because no confirmed master has been selected yet. As a consequence no request for data will be sent to the master, and the durability service will not be aligned. Solution: When a durability service receives a namespace update for a namespace for which no confirmed master has been selected yet, the update is rescheduled for evaluation at a later time instead of being discarded. |
OSPL-8914 |
Durability failure to merge data after a short disconnect When a node becomes disconnected it may lose its master. As a result the node will look for a new master. In doing so, the node would first unconfirm its current master and then wait for other fellows to propose a new master. The time to look for a new master is specified in the configuration file (DurabilityService.Network.Heartbeat.ExpiryTime). When the disconnection was shorter than DurabilityService.Network.Heartbeat.ExpiryTime, no merge was triggered. Solution: Whenever a node is discovered that is not simply starting and has no confirmed master, a merge is triggered, just like when there are conflicting masters. NOTE: This change has been reverted in 6.6.3p4f7. |
OSPL-8920 |
DDSI2 Crash Version 6.6.3p4 introduced a fix for OSPL-8872, taking the sequence number most recently transmitted by a writer when it matched a reader into account to force heartbeats out until all historical data has been acknowledged by the reader. The change also allowed a flag forcing the transmission of heartbeats informing readers of the availability of data to be set earlier than before in the case where the writer had not yet published anything at the time the reader was discovered. While logically correct, this broke the determination of the unique reader that had not yet acknowledged all data, in cases where there is such a unique reader. This in turn could lead to a crash. Solution: The aforementioned flag is once again never set before a sample has been acknowledged. |
OSPL-8928 |
Improve FACE documentation The FACE documentation was very rudimentary. Solution: A getting-started guide, API documentation, configuration documentation and an example have been added. |
OSPL-8948 / 16755 OSPL-8987 |
Race condition between durability data injection and garbage collecting of empty instances The durability service cached instance handles when injecting a historical data set in a way that could result in the historical samples being thrown away if the instance was empty and no known writers had registered it. Solution: The instance handle is no longer cached. |
OSPL-8953 |
Potential deadlock between reader creation and durability notification A thread that creates a new DataReader and a thread from the durability service that notifies a DataReader when it has completed its historical data alignment grab two of their locks in reverse order, causing a potential deadlock. Solution: The locking algorithm has been modified so that these two threads no longer grab both locks in reverse order. |
OSPL-8956 |
Temporary blacklisting of remote participants in DDSI2 The DDSI2 service now provides an option to temporarily block rediscovery of proxy participants. Blocking rediscovery gives the remaining processes on the node extra time to clean up. It is strongly advised that applications are written in such a way that they can handle reconnects at any time, but when issues are found, this feature can reduce the symptoms. Solution: A new setting has been added to the DDSI section of the configuration: Internal/RediscoveryBlacklistDuration, along with an attribute Internal/RediscoveryBlacklistDuration[@enforce]. The former sets the duration (10s by default), the latter whether to wait out the full period (true) or to allow reconnection as soon as DDSI2 has internally completed its clean-up (false, the default). It is strongly discouraged to set the duration to less than 1s. |
OSPL-8971 |
Catchup policy may incorrectly mark unregistered instances as disposed. When an instance is unregistered on the master node during a disconnect from another node that has specified a CATCHUP policy with that master, then upon a reconnect that unregister message will still be delivered to the formerly disconnected node. However, the reconnected node will dispose all instances for which it did not receive any valid data, so if the unregister message is the only message received for a particular instance, that instance will be disposed. Solution: The Catchup policy is now instructed to dispose only those instances for which it received neither any valid data nor an unregister message. |
OSPL-8984 |
DDSI handling of non-responsive readers needs improvement When a writer is blocked for ResponsiveTimeout seconds, DDSI declares the matching proxy readers that have not yet acknowledged all data "non-responsive" and continues with those readers downgraded to best-effort. This prevents blocking outgoing traffic indefinitely, but at the cost of breaking reliability. For historical reasons the default was set to 1s to limit the damage a non-responsive reader could cause, but past improvements to the handling of built-in data in combination with DDSI (such as fully relying on DDSI discovery for deriving built-in topics) mean there is no longer a need for such an aggressive default. Solution: The default behaviour has been changed to never declare a reader non-responsive, maintaining reliability even when a remote reader is unable to make progress. The changes also eliminate some spurious warning and error messages in the log files that could occur with a longer timeout. |
OSPL-9027 |
idlpp is not robust to paths with whitespace The idlpp tool could not handle paths containing whitespace when compiling for C++ and thus invoking cppgen. Solution: The arguments for cppgen generated by idlpp are now quoted. |
OSPL-9028 |
Bug in serializing messages with sequences of 64-bit types The SAC generic copy routines have an issue with alignment and empty sequences. Additionally, the legacy CDR serialiser (the only one up to, but not including, 6.4.1) has this issue because it always inserts padding for sequences of primitive and enumerated types, even when the sequence is empty. Solution: The SAC copy routines have been modified to deal with empty sequences. Even though the new CDR serialiser (since 6.4.1) does not have this issue and versions from 6.4.1 onward are not affected, the old code is still a compile-time option in the current versions for development purposes, so it has been fixed too. |
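To illustrate the padding issue described above: in CDR a sequence is a 4-byte length followed by its elements, and 64-bit elements must start on an 8-byte boundary. The sketch below (a hypothetical helper, assuming the sequence itself starts at an 8-byte-aligned offset) contrasts the correct encoding of an empty sequence with the legacy behaviour of always inserting the alignment padding:

```python
import struct

def cdr_sequence_uint64(values, legacy=False):
    """Encode a CDR sequence<unsigned long long>, assuming the sequence
    starts at an 8-byte-aligned offset.

    Layout: 4-byte length, then elements aligned to 8 bytes. A correct
    encoder only pads when there are elements to align; the legacy
    serialiser (legacy=True) always inserted the padding, even for an
    empty sequence, desynchronising writer and reader.
    """
    out = struct.pack('<I', len(values))      # 4-byte sequence length
    if values or legacy:
        out += b'\x00' * 4                    # pad from offset 4 up to 8
    out += b''.join(struct.pack('<Q', v) for v in values)
    return out

# Empty sequence: the correct encoding is just the 4-byte length...
assert len(cdr_sequence_uint64([])) == 4
# ...while the legacy serialiser emitted 4 spurious padding bytes.
assert len(cdr_sequence_uint64([], legacy=True)) == 8
```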
OSPL-9055 |
Potential sample drop during delivery to a local Reader In some cases, a dispose followed by an unregister does not result in the NOT_ALIVE_DISPOSED state on a Reader residing on the same node as the Publisher. In those cases, the Reader has its end state set to NOT_ALIVE_NO_WRITERS and reports that a sample has been lost. Solution: The root cause of this behaviour has not yet been identified; logging has been added to capture the context of the erroneous sample drop. This is a temporary measure and will be reverted once the root cause has been found and fixed. |
OSPL-9056 |
Potential deadlock during early abort of an application When an application aborts so quickly that the participant's leaseManager thread and its resendManager thread have not yet had the opportunity to start, the exit handler will block indefinitely waiting for these threads to exit the kernel. However, both threads are already blocked waiting to access a kernel that is already in lockdown. Solution: The constructor of the participant will not return before both the leaseManager and resendManager threads have entered the kernel successfully. |
OSPL-9058 / 16796 OSPL-9206 / 16852 |
Incompatibility with versions before V6.5.0p5 An internal change to the built-in heartbeat topic caused an incompatibility with older versions. When adding a node running a recent version of OpenSplice to a domain with nodes running a version before V6.5.0p5, the existing nodes would incorrectly dispose participants (and corresponding entities) belonging to the new nodes after a single heartbeat period, something that should normally only be done when a heartbeat expires. Solution: To resolve this, the change to the heartbeat topic was reverted. |
OSPL-9064 / 16808 |
Changes caused by OSPL-8914 in 6.6.3p4f4 have been reverted OSPL-8914 in the 6.6.3p4f4 release made several changes to the durability service in order to solve problems where a rapid disconnect/reconnect cycle would leave the durability service in an undefined state. In these situations, a disconnect had not yet been fully processed when the reconnect occurred. However, the solutions provided in 6.6.3p4f4 caused other, previously non-existing errors during normal operation. Solution: All changes made as part of OSPL-8914 in the 6.6.3p4f4 release have been reverted. As an alternative solution to rapid disconnect/reconnect cycle issues, DDSI now offers temporary blacklisting of recently disconnected participants (see OSPL-8956). |
OSPL-9067 |
Large topics are published but not received Loss of the initial transmission of the final fragments of a large sample did not trigger retransmit requests for those fragments until new data was published by the same writer. Solution: The receiving side now also requests retransmission of those fragments based on heartbeats advertising the existence of the sample without giving specifics on the number of fragments. |
OSPL-9068 / 00016813 |
Catchup policy may leak away some instances When a node that performs a catchup to the master contains an instance that the master has already purged, the node catching up needs to purge this instance as well. It does so by re-registering the instance, inserting a dispose message and then unregistering the instance again. However, the unregister step was missing, causing the instance to effectively leak, since an instance is only purged by the durability service when it is both disposed AND unregistered. Solution: The durability service will now both dispose AND unregister the instance at the same time. |
OSPL-9071 |
v_groupFlushAction passes a parameter that is not fully initialized. Valgrind reported that the v_groupFlushAction function passes a parameter that is not fully initialized. Although one of the uninitialized attributes was evaluated in a subsequent function invocation, this never caused issues because the value was only used as an operand of a logical AND whose other operand was always FALSE. Solution: All attributes of the parameter in question are now explicitly initialized. |
OSPL-9077 / 00016820 |
Potential crash in durability service during CATCHUP policy The durability service could crash while processing a CATCHUP event. This crash was caused by the garbage collector purging old instances while the CATCHUP policy was walking through the list of instances to do some bookkeeping. Solution: The CATCHUP policy now creates a private copy of the instance list while the garbage collector is prevented from making a sweep. This private list is then used to do the bookkeeping. |
OSPL-9081 / 00016824 |
Potential deadlock in the OpenSplice kernel The OpenSplice kernel had a potential deadlock where two different code paths could claim locks in the opposite order. The deadlock occurs when one thread is reading/taking the data out of a DataReader while the participant's listener thread is processing the creation of a new group (i.e. a unique partition/topic combination) to which this Reader's Subscriber is also attached. Solution: The locking algorithm has been modified in such a way that the participant's listener thread no longer needs to hold both locks at the same time. |
OSPL-9096 |
Durability service DIED message even though the durability service is still running The d_status topic is published periodically by the durability service to inform its fellows of its status. Because the status writer used a KEEP_ALL history policy, the thread writing the status message and renewing the service lease could be blocked by a flow-control issue on the network, which could cause the durability service to be considered dead by the splice daemon when in fact there was no problem with the durability service. Solution: Use a KEEP_LAST history QoS policy with depth 1 for the writer. |
OSPL-9097 |
DDSI transmit path can lock up on packet loss to one node while another node has crashed The transmit path in DDSI could lock up when a successful retransmit to one remote reader coincided with the disappearance (through loss of connectivity or a crash) of another remote reader that had not yet acknowledged all samples, while all other remote readers had acknowledged all samples and the writer had reached the maximum amount of unacknowledged data. The writer could then only be unblocked by the receipt of an acknowledgement covering a previously unacknowledged sample, which under these circumstances would never arrive because of the limit on the amount of unacknowledged data. Solution: Deleting a reader now not only drops all unacknowledged data but also clears the retransmit indicator of the writer. |
OSPL-9388 / 17030 |
Durability service might deadlock when the networking queue is flooded. When the network queue is overrun by the durability service, the normal mode of operation is to sleep briefly and retry later. However, there is a slight chance that the sending thread of the network service, which needs to make room again by consuming elements from the queue, will indirectly block on the sleeping thread in the durability service itself. Solution: The network service can no longer indirectly run into a lock that is held by the durability service while the network queue is flooded. |
Report ID. | Description |
---|---|
OSPL-9016 |
Changes caused by OSPL-8914 in 6.6.4 have been reverted OSPL-8914 in the 6.6.4 release made several changes to the durability service in order to solve problems where a rapid disconnect/reconnect cycle would leave the durability service in an undefined state. In these situations, a disconnect had not yet been fully processed when the reconnect occurred. However, the solutions provided in 6.6.4 caused other, previously non-existing errors during normal operation. Solution: All changes made as part of OSPL-8914 in the 6.6.4 release have been reverted. As an alternative solution to rapid disconnect/reconnect cycle issues, DDSI now offers temporary blacklisting of recently disconnected participants (see OSPL-8956). |
Report ID. | Description |
---|---|
OSPL-8953 |
Potential deadlock between reader creation and durability notification A thread that creates a new DataReader and a thread from the durability service that notifies a DataReader when it has completed its historical data alignment grab two of their locks in reverse order, causing a potential deadlock. Solution: The locking algorithm has been modified so that these two threads no longer grab both locks in reverse order. |
OSPL-8636 |
Streams throughput example hangs on Windows When using the streams throughput example on Windows, the subscriber application could hang on termination. Solution: The defect is fixed and the subscriber no longer hangs. |
OSPL-8886 |
Durability failure to merge data after a short disconnect When the disconnection period is shorter than twice the heartbeat period, a durability service may not have been able to determine a new master before the node is reconnected. In that case no master conflict is generated. In case the durability service is "late" in confirming a master, it may even occur that the master has updated its namespace, but the namespace update is discarded because no confirmed master has been selected yet. As a consequence no request for data will be sent to the master, and the durability service will not be aligned. Solution: When a durability service receives a namespace update for a namespace for which no confirmed master has been selected yet, the update is rescheduled for evaluation at a later time instead of being discarded. |
OSPL-8914 |
Durability failure to merge data after a short disconnect When a node becomes disconnected it may lose its master. As a result the node will look for a new master. In doing so, the node would first unconfirm its current master and then wait for other fellows to propose a new master. The time to look for a new master is specified in the configuration file (DurabilityService.Network.Heartbeat.ExpiryTime). When the disconnection was shorter than DurabilityService.Network.Heartbeat.ExpiryTime, no merge was triggered. Solution: Whenever a node is discovered that is not simply starting and has no confirmed master, a merge is triggered, just like when there are conflicting masters. NOTE: This change has been reverted in 6.6.4p1 because it may break other durability functionality. |
OSPL-8920 |
DDSI2 Crash Version 6.6.3p4 introduced a fix for OSPL-8872, taking the sequence number most recently transmitted by a writer when it matched a reader into account to force heartbeats out until all historical data has been acknowledged by the reader. The change also allowed a flag forcing the transmission of heartbeats informing readers of the availability of data to be set earlier than before in the case where the writer had not yet published anything at the time the reader was discovered. While logically correct, this broke the determination of the unique reader that had not yet acknowledged all data, in cases where there is such a unique reader. This in turn could lead to a crash. Solution: The aforementioned flag is once again never set before a sample has been acknowledged. |
OSPL-8948 / 16755 OSPL-8987 |
Race condition between durability data injection and garbage collecting of empty instances The durability service cached instance handles when injecting a historical data set in a way that could result in the historical samples being thrown away if the instance was empty and no known writers had registered it. Solution: The instance handle is no longer cached. |
OSPL-8956 |
Temporary blacklisting of remote participants in DDSI2 The DDSI2 service now provides an option to temporarily block rediscovery of proxy participants. Blocking rediscovery gives the remaining processes on the node extra time to clean up. It is strongly advised that applications are written in such a way that they can handle reconnects at any time, but when issues are found, this feature can reduce the symptoms. Solution: A new setting has been added to the DDSI section of the configuration: Internal/RediscoveryBlacklistDuration, along with an attribute Internal/RediscoveryBlacklistDuration[@enforce]. The former sets the duration (10s by default), the latter whether to wait out the full period (true) or to allow reconnection as soon as DDSI2 has internally completed its clean-up (false, the default). It is strongly discouraged to set the duration to less than 1s. |
OSPL-8957 |
Unnecessary heartbeat/acknowledgement traffic in DDSI with late joiners for transient-local data In DDSI, reliable writers send heartbeats to inform their readers of the existence of unacknowledged data, and keep doing so until the readers have acknowledged everything. The heartbeats indicate the range of sequence numbers available. The highest sequence number advertised in the heartbeat was the highest sequence number available for retransmit, leaving out any subsequent sequence numbers that are no longer available for retransmit. If a transient-local writer unregistered an instance and then became quiescent, it would be advertising a sequence number less than the latest sequence number it published, and this would lead to a late-joining reader never acknowledging all data up to this latest sequence number. As a result, the writer would keep sending heartbeats and the readers would keep acknowledging it. Sending new samples from the writer would break the cycle. Solution: the writer now advertises the latest sequence number it published. If there is a gap between the latest available for retransmit and the latest published, the reader will be informed that these sequence numbers are no longer relevant using a standard message. |
OSPL-8958 |
DDSI can regurgitate old T-L samples for instances that have already been unregistered DDSI maintains a writer history cache for providing historical data for transient-local writers and for providing reliability. An instance is removed from this cache when it is unregistered by the writer, but its samples are retained until they have been acknowledged by all (reliable) readers. Already acknowledged samples that were retained because they were historical data could survive even when the instance was removed. When this happened, a late-joining reader would see some old samples reappear. Solution: Deleting an instance now also removes the already acknowledged samples from the history. |
OSPL-8971 |
Catchup policy may incorrectly mark unregistered instances as disposed. When an instance is unregistered on the master node during a disconnect from another node that has specified a CATCHUP policy with that master, then upon a reconnect that unregister message will still be delivered to the formerly disconnected node. However, the reconnected node will dispose all instances for which it did not receive any valid data, so if the unregister message is the only message received for a particular instance, that instance will be disposed. Solution: The Catchup policy is now instructed to dispose only those instances for which it received neither any valid data nor an unregister message. |
OSPL-8972 |
Durability fellow state may be incorrect A durability service keeps track of the state of each fellow it knows. Every message that a fellow sends to a durability service includes the state of the fellow. The recipient uses different threads to handle incoming messages, and in rare cases these messages are processed out of order, leading to a wrong conclusion about the fellow's state. Solution: The durability service now uses the source timestamp in incoming messages to determine the order in which they were written, ensuring it never updates its internal state using 'old' information. |
OSPL-8973 |
Additional durability tracing when verbosity is set to FINEST Durability has been extended with additional tracing in the processing of namespace definitions received from fellows, in particular when checking for master conflicts. |
OSPL-8974 |
Durability conflict scheduling fails when multiple namespaces have the same policy and differ only in topic names Durability checks for conflicts between fellows (master, native and foreign state) that may require merging data whenever it receives a "d_nameSpaces" instance. If a conflict is detected, it enqueues it for eventual resolution, but only if an equivalent conflict is not already enqueued. Testing for equivalence is done by checking conflict kind, roles, and local and fellow namespaces. However, the namespace compare function (d_nameSpaceCompare) took neither the name nor the full partition+topic expressions into account. Consequently, when namespaces A and B have identical policies and differ only in the topic parts of the partition/topic expressions, a conflict for namespace A would be considered the same as a conflict for namespace B, resulting in a failure to merge data in B. Solution: The comparison now takes the name of the namespace into account. The configuration is required to have no overlap between namespaces and to have compatible namespace definitions throughout the system, so the name alone is sufficient. |
OSPL-8979 |
DDSI incapable of receiving multicasts after restart in single-process mode The tracking of joined multicast groups in DDSI could not handle the case where DDSI is restarted in single-process mode (e.g., by creating a participant, deleting it, and creating another one), potentially causing the new sockets not to join the multicast groups. Solution: DDSI now explicitly leaves all multicast groups on termination. |
OSPL-8980 |
With DDSI remote participants expire independently on cable disconnect The DDSI protocol ties lease expiry to participants, and the DDSI service faithfully implemented this. This means that a cable disconnect caused the leases of the various participants on the remote node to expire independently, and therefore also the automatic disposing and unregistering of data. A short disconnection where the lease of the durability service never expired, but where the lease of some application process did expire, could lead to an inconsistent state of the data space if that application published auto-disposed transient data. Solution: DDSI has been modified to implement only a single lease per remote federation for OpenSplice peers, by internally tying the leases of the applications to the lease of the remote DDSI service. |
OSPL-8984 |
DDSI handling of non-responsive readers needs improvement When a writer is blocked for ResponsiveTimeout seconds, DDSI will declare the matching proxy readers that have not yet acknowledged all data "non-responsive" and continue with those readers downgraded to best-effort. This prevents blocking outgoing traffic indefinitely, but at the cost of breaking reliability. For historical reasons it was set to 1s to limit the damage a non-responsive reader could cause, but past improvements to the handling of built-in data in combination with DDSI (such as fully relying on DDSI discovery for deriving built-in topics) mean there is no longer a need to have such an aggressive setting by default. Solution: The default behaviour has been changed to never declare a reader non-responsive and maintain reliability also when a remote reader is not able to make progress. The changes also eliminate some spurious warning and error messages in the log files that could occur with a longer timeout. |
OSPL-8989 |
Non-atomic dispose+unregister operation on DCPSHeartbeat A race condition between auto-disposing/unregistering, taking data from a data reader and merging of historical data was resolved in OSPL-8684 by performing an atomic dispose+unregister operation instead of two separate operations. This fix covered all cases except the DCPSHeartbeat built-in topic, which is handled specially by the splice daemon and still performed two independent operations. Solution: The atomic dispose+unregister is now also used for DCPSHeartbeat. There are no user-visible consequences of this change. |
TSTTOOL-395 |
Python scripting examples need to be updated to reflect the
ability to create and specify QoS settings on entities. Solution: The Python scripting examples have been updated to show how QoS settings can be created and applied to entities. |
Report ID. | Description |
---|---|
OSPL-8920 |
DDSI2 Crash Version 6.6.3p4 introduced a fix for OSPL-8872, taking the sequence number most recently transmitted by a writer when it matched a reader into account to force heartbeats out until all historical data has been acknowledged by the reader. The change also allowed a flag forcing the transmission of heartbeats informing readers of the availability of data to be set earlier than before in the case where the writer had not published anything yet at the time the reader was discovered. While logically correct, this broke the determination of the unique reader that had not yet acknowledged all data in cases where there is such a unique reader. This in turn could lead to a crash. Solution: The aforementioned flag is once again never set before a sample has been acknowledged. |
OSPL-8914 |
Durability failure to merge data after a short disconnect When a node becomes disconnected it may lose its master. As a result the node will look for a new master. In doing so, the node first unconfirms its current master and then waits for other fellows to propose a new master. The time to look for a new master is specified in the configuration file (DurabilityService.Network.Heartbeat.ExpiryTime). When the disconnection was shorter than the DurabilityService.Network.Heartbeat.ExpiryTime, no merge was triggered. Solution: Whenever a node is discovered that is not simply starting and has no confirmed master, a merge is triggered, just like when there are conflicting masters. |
OSPL-8886 |
Durability failure to merge data after a short disconnect When the disconnection period is shorter than twice the heartbeat period, a durability service may not have been able to determine a new master before the node is reconnected again. In that case no master conflict is generated. In case the durability service is "late" in confirming a master it might even occur that the master has updated its namespace, but the namespace update is discarded because no confirmed master has been selected yet. As a consequence no request for data will be sent to the master, and the durability service will not be aligned. Solution: In case a durability service receives a namespace update for a namespace for which no confirmed master has been selected yet, the update is rescheduled for evaluation at a later time instead of being discarded. |
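The master-selection fix in the OSPL-8914 entry above can be summarized as a small decision rule: a merge is triggered not only when two fellows have conflicting confirmed masters, but also when a discovered fellow that is not simply starting has no confirmed master at all. The sketch below is a schematic illustration in plain Python, not OpenSplice code; the names `Fellow` and `should_trigger_merge` are invented for this example.

```python
# Schematic model of the merge-trigger rule; all names are illustrative.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Fellow:
    is_starting: bool                  # fellow is in its start-up phase
    confirmed_master: Optional[str]    # master it has confirmed, if any

def should_trigger_merge(own_master: str, fellow: Fellow) -> bool:
    """Return True when discovering this fellow must trigger a merge."""
    if fellow.is_starting:
        return False                   # start-up alignment covers this case
    if fellow.confirmed_master is None:
        return True                    # new rule: no confirmed master => merge
    # classic rule: conflicting confirmed masters
    return fellow.confirmed_master != own_master

# A fellow reconnecting after a short disconnect, master not yet re-confirmed:
print(should_trigger_merge("nodeA", Fellow(False, None)))      # True
# A fellow that agrees on the master: no merge needed.
print(should_trigger_merge("nodeA", Fellow(False, "nodeA")))   # False
```

The point of the extra clause is that a short disconnect can leave a fellow with an unconfirmed master without ever producing a master *conflict*, so conflict detection alone was not enough to trigger realignment.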
Report ID. | Description |
---|---|
OSPL-8884 |
Spliced assertion failure (_this->state & L_UNREGISTER) in v_registrationMessageCompare The record of a writer unregistering an instance is retained for a little while before it is completely removed from the system. In the case of spliced and DCPSHeartbeat, if the spliced needed to auto-dispose the DCPSHeartbeat in this window, it could resurrect the registration but leave it in an inconsistent state, resulting in this assertion failure. Solution: DDSI now correctly tags the "unregister" events. |
OSPL-8825 / 00016727 OSPL-8722 / 00016668 |
Invalid timing during replay with non-default speed by the Record and Replay service A regression introduced by a previous fix caused the Record and Replay service not to compensate for a non-default replay speed in certain circumstances during replay. Solution: The issue was resolved by making adjustments to the replay timing algorithm so that the replay speed is always factored in. |
OSPL-8684 / 00016645 |
Instances that were NOT_ALIVE due to a disconnect do not always become ALIVE after a reconnect When a disconnect occurs, all instances owned by the disconnected Writers should become NOT_ALIVE_DISPOSED (in case a disconnected Writer had its auto_dispose_unregistered_instances set to TRUE) or NOT_ALIVE_NO_WRITERS. When the disconnected node reconnects and the topic is non-VOLATILE, you might expect the instances to go back to the ALIVE state. However, that did not happen in all scenarios. Solution: If the NOT_ALIVE instances are taken by the DataReader, they will now instantly be purged. When merge policies are configured to re-insert the missing messages back into the DataReader after the connection is re-established, these re-inserted samples will cause the instance to be re-created starting in the NEW, NOT_READ, ALIVE state. Note however, if the NOT_ALIVE samples are not taken out at the time of the disconnection, their instances will not be purged, and the re-inserted samples will be discarded as duplicates of samples that are already there. In those cases the instance state will still not be changed back to ALIVE. |
OSPL-8857 / 00016738 |
Unable to start OpenSplice Tuner on Windows An incomplete classpath in the manifest of ospltun.jar caused a ClassDefNotFound exception when starting the Tuner. Solution: The issue was resolved by fixing the classpath handling in manifest file generation. |
OSPL-8872 |
DDSI transient local data retrieval may block indefinitely when packet loss occurs halfway through a fragmented
sample if the writer is quiescent When the DDSI service is retrieving historical transient-local data from a remote writer (this includes DDSI discovery data), it depends on a constant back-and-forth between requests for retransmits (NACKs) and heartbeats from the writer that allow it to send NACKs. The writer adapts its heartbeat rate based on the state of the readers it surmises from the ACKs/NACKs it receives. On receipt of an ACK it concludes the reader must have received all data (otherwise it should have NACKed the missing samples), which may result in no more heartbeats going out. Requesting a retransmit of individual sample fragments is done using a message different from the standard NACK message. If a reader has received part of a fragmented sample, it will send such an alternative NACK for the missing fragments while sending an ACK for everything up to that point. This ACK caused the writer to incorrectly conclude the reader had received everything. Writing a single new sample would cause the back-and-forth to restart and continue until all data had been transferred. Discovery of a new reader would temporarily trigger heartbeats as well, restarting the sequence, but in a state that further packet loss could cause it to stop again. Solution: the writer requires that the ACK concerns a high enough sequence number before concluding it need not send any more heartbeats. |
OSPL-8847 |
When coherent transactions containing topics with strings as key are being aligned, the receiving durability service
could crash Durability services are used to align historical data. In case coherent transactions exist, special messages called End-Of-Transaction (EOT) messages are aligned to indicate which parts of the transaction are completed. If topics are aligned that have a string as key and that belong to a transaction, the receiving durability service could crash because it tries to access unallocated memory. Solution: The code that was responsible for the illegal memory access has been fixed. |
OSPL-8803 |
Durability tracing logs do not mention the OpenSplice version number The durability tracing log generated by the OpenSplice durability service (when tracing is enabled) does not mention the OpenSplice version number. Solution: The OpenSplice version number is now added to the durability tracing log. |
OSPL-8819 |
Memory rise with transient-local data using DDSI When using transient-local data in combination with DDSI, DDSI starts to allocate memory and never frees it. Solution: When using transient-local, DDSI did not request the receiving side to acknowledge the sent data, but still kept it in a list, allocating more and more memory for each message sent. DDSI has been changed to request acknowledgment when using transient-local data. |
OSPL-8771 / 00016706 |
Memory leak after deleting a transient reader A reference counting error in determining whether a reader is entitled to transient data caused part of the data reader administration to be retained after deleting the data reader. Solution: The stray reference is now released. |
OSPL-7244 |
Tuner does not support writing samples containing multi-dimensional collections The Tuner tool does not support editing multi-dimensional sequences (in IDL, sequence<sequence<some_type>>, or some_type[x][y]). When editing such data fields in the Tuner writer table, the fields will be uneditable. This also affects editing Google Protocol Buffer samples that contain a field defined as repeated bytes, as that is represented as a sequence of a sequence of octets in the data model. Solution: Support for editing certain multi-dimensional collections has been added, specifically editing of two-dimensional unbounded sequences of primitives, such that the Google Protocol Buffer type "repeated bytes" can be edited. Other more complex types, such as N-dimensional bounded/unbounded sequences or arrays of primitives or of structs, are still not editable. |
OSPL-8804 / 00016720 |
Late joining DataReader receives at most 1 sample per DataReaderInstance per unfinished transaction When a transaction on a transient topic contains more than one sample per instance, and the transaction is not yet finished at the time a late joining Volatile Reader requests historical data, the reader will receive only the first sample for each ReaderInstance in that transaction, regardless of the Reader's history depth. Solution: The Volatile Reader will now receive all samples of the unfinished transaction. |
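The heartbeat problem in the OSPL-8872 entry above boils down to a threshold check: the writer may only conclude that a reader needs no further heartbeats once the reader's ACK covers a high-enough sequence number. The following plain-Python sketch is a schematic illustration, not DDSI code; the function name and parameters are invented for this example.

```python
def may_stop_heartbeats(acked_seq: int, seq_at_match: int) -> bool:
    """Writer-side check: stop sending heartbeats to a reader only when its
    ACK covers at least the last sequence number the writer had transmitted
    when the reader was matched, i.e. all historical data is provably
    received. Before the fix, *any* ACK was taken as proof."""
    return acked_seq >= seq_at_match

# Reader matched while samples 1..10 had already been sent. An ACK for
# sample 4 (sent alongside a fragment-NACK for missing pieces of sample 5)
# must not silence the heartbeats, or the retransmit back-and-forth stalls:
print(may_stop_heartbeats(acked_seq=4, seq_at_match=10))    # False
# Only once everything up to sample 10 is acknowledged may they stop:
print(may_stop_heartbeats(acked_seq=10, seq_at_match=10))   # True
```

This captures why the original bug was so disruptive: the fragment-level NACK is a different message from the sample-level ACK, so an ACK for "everything up to here" arrived even though fragments were still missing.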
Report ID. | Description |
---|---|
OSPL-6961 |
Cause of an unregister not correctly tracked when a DDSI lease expires There are two reasons why an "unregister" of an instance can take place: the first, most common one, is that a writer explicitly calls unregister_instance, or that it automatically happens when the writer is deleted; the second one is when a remote writer suddenly disappears and the other nodes have to synthesize one. The two are indistinguishable to applications, but the way they must be treated for transient data following a reconnect is different. DDSI did not properly distinguish between the two cases. Solution: DDSI now correctly tags the "unregister" events. |
OSPL-8563 |
OS_INFO payloads from Node Info samples contain invalid XML
character on Windows 10 64bit, Java 8 When publishing OS_INFO information using NodeMonitor on Windows 10 in conjunction with Java 8, a form feed XML character is supplied by the underlying Sigar library, which cannot be processed by Vortex OpenSplice and causes the following exception: [error] o.o.a.c.t.xml - SAXException: Character reference "" is an invalid XML character. Solution: The NodeMonitor implementation now strips the invalid XML form feed character out of the OS_INFO description before publishing the data. |
OSPL-8585 |
Display the Topic Type definition wherever a GPB evolution can
be selected Tuner does not yet support selecting a protobuf type evolution when writing/reading. Solution: Tuner's support for Google Protocol Buffers has been updated with the ability to directly view the type definition of the selected type evolution when choosing a type evolution to read/write data as. |
OSPL-8666 / 16651 |
Reading with QueryConditions could yield samples that shouldn't
be read based on the masks provided. For instance, when a QueryCondition has the mask NOT_READ_SAMPLE_STATE, only samples with that state should be returned when reading or taking with that condition. However, the query implementation did not include testing the masks for trigger values. This means that when using a triggered QueryCondition with the NOT_READ_SAMPLE_STATE, you would be able to read samples with the READ_SAMPLE_STATE. This also applied to the view state and instance state. Solution: Mask checks have been added for triggered QueryConditions. |
OSPL-8691 |
DDSI not properly retransmitting end-of-transaction messages. When using coherent updates, an end-of-transaction message is used to notify the subscribers that the set is now complete and may be committed. These messages may get lost and hence may need to be retransmitted, just like ordinary samples. The retransmit path of DDSI however failed to handle these correctly. In consequence, the subscribing side would automatically reconstruct an end-of-transaction message at a later stage, one that is sufficient for guaranteeing topic-level coherence, but not for group-level coherence. Solution: DDSI has been updated to properly retransmit end-of-transaction messages when required. |
OSPL-8708 |
Tuner cannot acquire key list for protobuf topics whose type
contains a oneof. If the user creates a reader or writer frame for a topic whose type is a protobuf type containing a oneof declaration, the reader or writer frame will show a warning in the status bar that the key list could not be obtained. Consequently, the user data table would not have any highlighting for key or protobuf required fields. Solution: The underlying API that acquires protobuf-specific properties for user data fields (such as the required flag or default values) did not properly account for the switch field, which is an invented field for the mapping of oneof to union for representation in existing tooling data models. The tooling protobuf API was fixed to account for this switch field. |
OSPL-8710 / 16663 |
Issue with Java waitset during termination of domain. When a Java application is blocking on a waitset _wait() call while at the same time OpenSplice is stopping, the application may run into an ArrayIndexOutOfBoundsException. All conditions in the waitset are supposed to trigger because they are detached from the domain, but the Java binding did not properly resize the ConditionSeqHolder array supplied by the application to contain all conditions. Solution: To resolve the issue a check on the array size was added, resizing if required. |
OSPL-8724 |
A crash could occur when the system terminates while there are
pending events to process. Events are handled asynchronously. In the exceptional case where the system terminates while there are pending events, these events must be cleaned up. A bug could cause an unmanaged piece of memory to be accessed, potentially causing a crash. Solution: When the system terminates while there are pending events, these events are now cleaned up correctly. |
OSPL-8728 |
Unknown object kind error log when using isocpp2 listener. When using the isocpp2 API with listeners, an "Unknown object kind" error log shows up when listening for the DATA_AVAILABLE status event. Solution: The isocpp2 listener mechanism has been fixed and the error no longer appears when using the DATA_AVAILABLE status. |
OSPL-8736 |
Registering the same type twice in C# raises a bad-parameter exception. There was a code path in the C# API in which registering a type that was already known did not set the correct return value, causing a successful call to raise an exception. Solution: Added handling of the result code so that OK is returned when all went well. |
OSPL-8737 |
Lag during replay by Record and Replay service. When a reasonable load is placed on the RnR service, e.g. by replaying a storage with many samples recorded at a high frequency from different topics and/or writers, the service may not be able to keep up with the exact original 'recorded' timing and a sample is replayed with a delay. The delays of individual samples add up, resulting in a noticeable delay sooner or later as the replay carries on. Solution: The replay timing has been improved to compensate for delays on individual samples, catching up lost time so the replay in general doesn't lag behind. |
OSPL-8738 |
Lease handling in the DDSI service is sensitive to time jumps. The DDSI service's handling of the leases of remote participants was still based on the wall clock time, and therefore a forward jump of the wall clock by more than a few seconds could cause lease expiry, and hence disconnections. Similarly, a backward jump of the wall clock time could delay lease expiry. Solution: The lease handling is now based on a monotonic clock that counts time elapsed since an arbitrary reference in the past. |
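The clock change in the OSPL-8738 entry above is worth a concrete illustration: wall-clock time (`time.time()`) can jump when the system clock is adjusted, while a monotonic clock (`time.monotonic()`) only ever moves forward at a steady rate, so lease expiry based on it cannot be triggered or delayed by clock adjustments. The sketch below is plain Python, not DDSI code; the `Lease` class is invented for this example, with an injectable clock so the behaviour can be demonstrated deterministically.

```python
import time

class Lease:
    """Minimal lease with an injectable clock (defaults to the monotonic
    clock, as in the fix described above)."""
    def __init__(self, duration_s, now=time.monotonic):
        self._now = now                      # clock function, injectable for testing
        self.duration = duration_s
        self.expiry = self._now() + duration_s

    def renew(self):
        self.expiry = self._now() + self.duration

    def expired(self):
        return self._now() > self.expiry

# Simulate elapsed time with a fake clock. With a monotonic clock a
# wall-clock jump (NTP step, DST change) simply never appears in the
# readings, so it cannot spuriously expire or prolong the lease.
fake_time = [100.0]
lease = Lease(10.0, now=lambda: fake_time[0])
fake_time[0] += 5.0       # 5 s pass: lease still valid
print(lease.expired())    # False
fake_time[0] += 20.0      # 20 more seconds without renewal: lease expires
print(lease.expired())    # True
```

The key design point is that the monotonic clock counts time elapsed since an arbitrary reference, so only genuine elapsed time (not clock adjustments) drives expiry.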
Report ID. | Description |
---|---|
OSPL-7245 |
Enable highlighting of required fields in GPB user data. Google Protocol Buffer samples in Tuner do not highlight so-called required attributes. This makes it difficult to see what needs to be edited before samples can be written. Solution: The Vortex OpenSplice Tuner UI has been updated to distinguish between key, required and optional fields for Google Protocol Buffer samples. Required user data fields in sample tables are highlighted with the cyan colour in the same fashion as in Vortex OpenSplice Tester. |
OSPL-7392 OSPL-8450 |
The durability service does not properly align open coherent transactions. As soon as a durability service joins an existing system it will retrieve all non-volatile historical data that is available within the system by requesting this data from an available durability service (the master). The master durability service should provide the historical data to the newly joined durability service (this is called alignment). Unfortunately, the master durability service did not align the non-volatile historical data that belongs to a coherent transaction that has not been committed (finished) yet. In particular, several related alignment problems were identified.
Solution: Data that is part of a non-finished coherent transaction is now being aligned, just as the end-of-transaction markers. This allows the late joining durability service to reconstruct the same state as the master, and consequently completeness of transactions is handled correctly in various disconnect scenarios. |
OSPL-8102 |
Minimal configured database size required If a database size is configured below the minimal database size, OpenSplice will not start and stops with out-of-memory problems. Solution: If the configured database size is below the minimal required size, it will be increased to the minimal required size and a warning trace will be logged. The minimal database size for 32-bit builds is 2 MB and for 64-bit builds it is 5 MB. The default memory threshold size for free memory will be 10% of the configured database size if the configured database size is less than the default 10 MB. |
OSPL-8490 / 16602 |
Compatibility issues with dcpssacsAssembly.dll When working with the dcpssacsAssembly.dll provided with the C# API there can be compatibility issues with Visual Studio versions. This is because the dcpssacsAssembly.dll is created with the use of the .NET framework 2.0. Solution: The dcpssacsAssembly.dll is now generated with the corresponding Visual Studio .Net Framework compiler. |
OSPL-8528 |
Closing a DDS entity containing closed entities
intermittently results in an AlreadyClosedException When concurrently calling close methods on entities, the Java5 API may throw an AlreadyClosedException in case one of the children of the offending Entity already has been closed. Solution: The implementation of the various close methods has been modified to be able to deal with concurrent closing of multiple entities by the application. |
OSPL-8544 |
Ordered_access for GROUP scope does not handle lifespan expiry
correctly. When a sample's lifespan has expired in a Reader that belongs to a Subscriber with GROUP Presentation scope, it is not removed from the Reader cache at the start of a coherent_access block. Because reader caches are locked for any type of modification during a coherent_access block, the get_datareaders() function of the Subscriber and the read/take functions on individual Readers can still access samples that should have expired. Solution: The start of a coherent_access block on the Subscriber will first test the contents of its Reader caches for expired samples. All samples that have expired will be purged prior to locking their Reader caches. |
OSPL-8566 / 16634 |
Deployment Guide vs Configuration Guide In the set of documents delivered in the pdf directory of each OSPL version appears the Deployment Guide. This document refers about 56 times to "the separate Vortex OpenSplice Configuration Guide". The references should instead point to section 12 of the Deployment Guide itself. Solution: The references in the document have been updated. |
OSPL-8576 |
TypeEvolution chooser should be hidden when the Google Protocol
Buffers feature is disabled for the platform. In Vortex OpenSplice Tuner, when choosing to read or write data from a GPB-defined topic, the user is prompted with a dialog window to choose which type evolution to view the data as. If the Vortex OpenSplice installation does not have the Google Protocol Buffer feature enabled, then Tuner should not prompt the user for type evolution choice, since it wouldn't affect anything. Solution: If the Google Protocol Buffer feature for Vortex OpenSplice is not included in the installation, then the type evolution choice in Vortex OpenSplice Tuner will not appear. |
OSPL-8583 |
Possible memory leak when reader is removed with uncommitted
transactions When a reader was removed that still had unread complete transactions, it was possible that those transactions leaked. Solution: The internal transaction administration has been updated so that when a reader is removed, its transactions are removed too. |
OSPL-8596 |
Wrong dispose in protobuf example In the protobuf example, the 'Person' protobuf type has two key fields: name and worksFor.name. The publisher code first writes a sample with name="Jane Doe" and worksFor.name="Acme Corporation". Then it is supposed to dispose this instance, but in fact disposes another instance, as it only sets the name field to "Jane Doe" and does not set the worksFor.name field. Solution: The worksFor.name is now also set for the dispose. |
OSPL-8609 |
When coherent group access is used a resource claim regarding the
resource limits of the group is performed twice. When coherent access is used, the resource counters maintained in the group regarding the resource limits set on the group are decremented twice. This may cause sample loss because of the incorrect resource counters in the group. Solution: The group resource claim on the resource limit is performed only once for each coherent transaction. |
OSPL-8618 |
When the durability client receives an event to request
historical data while the subscriber is being destroyed, the
durability client could crash when it tries to handle the event. When the durability client receives an event to request historical data from a reader, the durability client will asynchronously determine the partitions for which data must be requested from the reader's subscriber. In case the subscriber has just been destroyed, the durability client tries to determine the partitions from a non-existent subscriber. This inevitably leads to a crash. Solution: Before determining the partitions from the reader's subscriber, a check is added to see if the subscriber exists. If the subscriber exists, reference counting and locking ensure that the partitions can be obtained, even if another thread is in the process of destroying the subscriber. |
OSPL-8640 |
C++ RMI applications may crash on Windows when compiled with VS 2015 C++ RMI applications may crash on Windows platforms when compiled with the Visual Studio 2015 compiler. The problem is that constructors of classes explicitly call the default constructor of their virtual parent class. However, that default constructor is not implemented. Solution: These explicit calls to default constructors have been removed. |
OSPL-8643 |
A durability service that cannot act as aligner for a namespace
does not merge when its fellow aligner re-appears within the expiry
time A durability service can be configured as non-aligner (alignee) for a namespace, in which case it is dependent on a fellow durability service (master) to provide historical data for this namespace. When the alignee loses its master (e.g., due to a disconnect) there is no durability service to provide historical data for this namespace, and the namespaces of the aligner and the alignee may diverge. When a reconnect occurs the master suddenly becomes available again, and the alignee should trigger a merge action to resolve the potentially diverged state. This was not happening in case a master appeared within the expiry time. Consequently, the alignee could end up in a diverged state. Solution: When the master of an alignee disconnects from the system, the namespace state is cleared and its master is reset. As soon as an aligner appears a master conflict is detected and a merge will be triggered. |
TSTTOOL-372 / 16541 TSTTOOL-379 / 16578 |
Python Scripting Engine should support reading or writing to
non-default partitions The Python scripting engine was connecting to the default partition (the partition with an empty name) for all readers and writers. Solution: With this fix, the scripting engine connects by default using the partition name pattern '*', which is consistent with Tester's behaviour. In addition, it is now possible to explicitly create subscribers and publishers, and to explicitly provide the desired partition name/pattern. |
TSTTOOL-373 / 16541 TSTTOOL-378 / 16578 |
Python Scripting Engine should support setting QoS settings on
entities The Python scripting engine did not allow specification of Quality of Service (QoS) policies when creating entities; all readers and writers were configured with default policies. Solution: With the fix, it is now possible to specify QoS policies on the following entities: readers, writers, publishers and subscribers. |
TSTTOOL-391 |
Python Scripting Engine should support WaitSets The Python scripting engine did not permit creation of wait sets. Solution: With this fix, a WaitSet class has been implemented, and it is possible to add Read Conditions to the wait set. Note that the WaitSet implementation does not yet support Status conditions or Query conditions. |
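The OSPL-8596 entry above is a reminder that a DDS instance is identified by *all* of its key fields, so a dispose that sets only some of them addresses a different instance than the one originally written. The sketch below models this in plain Python (not the DDS API); the key-to-state dictionary and the `write`/`dispose` helpers are invented for this illustration.

```python
# Schematic model: instance identity is the full tuple of key field values.
instances = {}  # maps (name, worksFor.name) key tuple -> instance state

def write(name, works_for_name):
    instances[(name, works_for_name)] = "ALIVE"

def dispose(name, works_for_name=""):
    # An unset key field defaults to "", mirroring a partially filled sample.
    instances[(name, works_for_name)] = "NOT_ALIVE_DISPOSED"

write("Jane Doe", "Acme Corporation")
dispose("Jane Doe")                         # bug: worksFor.name left unset
print(instances[("Jane Doe", "Acme Corporation")])   # still "ALIVE"
dispose("Jane Doe", "Acme Corporation")     # fix: all key fields set
print(instances[("Jane Doe", "Acme Corporation")])   # "NOT_ALIVE_DISPOSED"
```

The buggy call silently creates (and disposes) a second instance keyed `("Jane Doe", "")` while the intended instance stays alive, which is exactly the behaviour the example fix corrects.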
Report ID. | Description |
---|---|
OSPL-8547 / 16628 |
Topic definitions from KV-persistent store are not announced to other nodes Topic definitions from the kv-store are not announced to other nodes. If no application on another node has already registered the type definition, the durability service on that node may not reach the operational state. Solution: The kv-store will announce the topic definitions found in the persistent store to all other nodes. |
Report ID. | Description |
---|---|
OSPL-7457 |
Interoperable transient-local support. Transient-local data was handled as transient data, but in combination with running DDSI as the network protocol, data sometimes got delivered multiple times. Solution: In cases where DDSI is the network protocol, historical transient-local data is delivered by the DDSI2(E) service and no longer by the durability service. |
OSPL-7515 / 15645 |
When the merge policy for a namespace that is responsible for
builtin topics is different from IGNORE or MERGE, the system state
can become inconsistent. In the configuration a user can specify merge policies for namespaces. These merge policies specify how to resolve state divergence due to disconnections. If the merge policy that is specified for the namespace that is responsible for the builtin topics is different from IGNORE or MERGE, the internal state of the system can become inconsistent. This is undesirable and should be considered an invalid configuration. Solution: The durability service now terminates when the merge policy that is specified for the namespace that is responsible for the builtin topics is different from IGNORE or MERGE. Termination will only occur when the durability service is responsible for builtin topic alignment. |
OSPL-8044 |
Undesired timeout on wait_for_historical_data without durability When durability is not enabled, a wait_for_historical_data call would still block for the timeout period, even though no historical data will be delivered to a reader without a durability service. Solution: A check was added to determine if durability is used. If not, wait_for_historical_data will return precondition_not_met instead of blocking and returning timeout. |
OSPL-8306 / 16511 OSPL-8096 / 16183 OSPL-8313 / 16513 OSPL-8314 / 16514 OSPL-8315 / 16512 |
The networking service may crash when a reconnect occurs under
high load conditions. When the networking service determines that another node has not reacted in time, that node is removed from the reliable protocol. If the just-removed node reconnects shortly thereafter, it may occur that some variables related to this node are not correctly initialized to their default values. Solution: The state variables that the networking service maintains for the participating nodes are set to their default values when the node is removed from the reliable protocol. This ensures that when a node is considered alive again, the node-related information has the correct state. |
OSPL-8310 |
Statistics not enabled when not explicitly setting enabled
attribute to true. When the enabled attribute of the //OpenSplice/Domain/Statistics/Category element is not explicitly configured and set to true, the statistics for that category are not enabled. Solution: When the enabled attribute is not configured it is now interpreted as being set to true. |
OSPL-8311 |
ThrottleLimit configuration does not accept human-readable sizes as input. The //OpenSplice/NetworkService/Channels/Channel/Sending/ThrottleLimit provides the lower bound for throttling and is really a size in bytes. Until now this value would not accept human-readable sizes. Solution: Human-readable sizes can now be used as input for ThrottleLimit |
OSPL-8391 / 16538 OSPL-8462 / 16573 OSPL-8467 / 16579 |
The reception of a dispose message may cause a crash when the
corresponding instance has been removed. When a dispose message is received, the corresponding datareader instance is looked up using the corresponding group instance. However, in the meantime the datareader instance may have been removed. The dispose message will then trigger the creation of a new datareader instance, but at that time the key information present in the dispose message is not available, which may cause access to an invalid pointer value. Solution: When the datareader receives a dispose message for an instance that has been removed, the dispose message will be dropped because it will not result in any state change. |
OSPL-8439 / 16570 |
Include dependency for constant declarations missing in idlpp. When declaring a constant in an IDL file whose type is taken from an included IDL file, idlpp forgets to take the include dependency into account. This means that the generated code will refer to a type from another generated file, but the first file will not include the second file, resulting in compilation errors. Solution: The missing dependency has now been added to idlpp. The type of a constant declaration will now be properly included if located in another file. |
OSPL-8477 / 16597 |
Registering a protobuf-modelled type multiple times leaks memory When registering a type, its meta-data gets stored as well to allow tools to generically create datawriters and datareaders without the need for pre-compiling the type information into the tools. When registering a protobuf-modelled type with the same DomainParticipant multiple times, however, the definition leaked due to a missing deallocation. Solution: The missing deallocation of the duplicate type in the type registration algorithm has been added. |
OSPL-8480 / 16599 |
Ordered_access does not always suppress invalid samples correctly. When using ordered_access when reading/taking data, some additional (invalid) samples could appear that you would normally not see when reading from a similar Reader that is not using ordered_access. This is caused by the fact that normal DataReaders suppress invalid samples when they can mask those invalid samples by valid samples for the same instance. However, in case of ordered_access, this suppression mechanism was not being applied. Solution: The suppression mechanism for invalid samples is now also applied to DataReaders that use ordered_access. |
OSPL-8488 / 16604 |
Coherent access in combination with resource limits can result in not receiving sample updates. When using coherent access in combination with resource limits, and max_samples_per_instance is set to 1, samples might not be received by the datareader. Solution: The defect in the datareader mechanism is resolved and samples are now properly received. |
OSPL-8497 / 16613 |
ArrayIndexOutOfBoundsException can occur during termination when
using waitsets in the Java API. When using the Java API in combination with waitsets an ArrayIndexOutOfBoundsException can occur during termination of the middleware. Solution: The defect in the waitset mechanism is resolved and the exception will not occur anymore. |
OSPL-8508 |
DCPSPublication samples maintained longer than necessary. In some cases, samples for the DCPSPublication topic were maintained longer than necessary by spliced. Even though eventually spliced would reclaim the memory, this appeared as if there was a memory leak. Solution: The processing of the samples by spliced has been changed, so that cleanup happens as soon as possible instead of deferred. |
TSTTOOL-369 / 16350 |
Tester in Jython does not support float NaN values in the check method. In the provided Python Scripting Engine example, a Topic's float member with a NaN value could not be checked and the check function would return false. Solution: The _checkPyobj(pyobj, checkValues, logFailures = False) function in $OSPL_HOME/tools/scripting/examples/tester_compat.py has been modified to handle this case. |
TSTTOOL-376 / 16517 |
Install Jython for ospltest scripts. When installing Jython on Windows platforms following the instructions in the Tester User Guide, users would be unable to install the library due to spaces in the OSPL_HOME environment variable. Solution: Updated the documentation to have quotes around the environment variables that could have spaces on Windows. |
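To illustrate the ThrottleLimit change described at the top of this table, a configuration fragment might now look as follows. This is a minimal sketch: the channel name and attribute values are assumptions for illustration, only the //OpenSplice/NetworkService/Channels/Channel/Sending/ThrottleLimit path comes from the issue description.

```xml
<OpenSplice>
  <NetworkService name="networking">
    <Channels>
      <Channel name="reliable" reliable="true">
        <Sending>
          <!-- Lower bound for throttling; previously only a plain byte
               count (e.g. 10240) was accepted, but a human-readable
               size such as 10KiB can now be used as well. -->
          <ThrottleLimit>10KiB</ThrottleLimit>
        </Sending>
      </Channel>
    </Channels>
  </NetworkService>
</OpenSplice>
```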
Report ID. | Description |
---|---|
OSPL-8139 |
Durability creates defaultNamespace when configuration is invalid. The durability service can silently ignore an invalid configuration (i.e. no namespaces and/or policies are included, or are not consistent). In that case the service applies the default configuration, i.e. a defaultNamespace with initial alignment and alignee policy. Since namespaces are also exchanged with other durability services in a domain, the resulting behaviour of the mis-configured node, but also other nodes, may be far from what the user expects. Solution: Instead of silently applying default configuration parameters, the service now refuses to start in case the configuration is missing or invalid, reporting relevant errors to the OpenSplice error log. |
OSPL-8221 / 16391 |
DataReader may crash when using ordered_access. The administration of a DataReader that has a PresentationQos with ordered_access set to TRUE may become corrupted when all available samples are taken out of the DataReader. This corruption might in turn result in a crash of the publishing (when on the same federation as the corrupted reader) or subscribing application. Solution: The ordering mechanism has been fixed so that taking the last sample out of a Reader with ordered_access set to TRUE can no longer corrupt the Reader's administration and therefore no longer crash your applications. |
OSPL-8234 |
Samples written without begin_coherent_changes broke ordering. Samples (including implicitly created samples such as the unregister on writer deletion) written with a coherent writer without calling begin_coherent_changes first could be delivered out of order and during a begin_access. Solution: For a coherent writer, every sample is now part of a transaction. A write without a preceding begin_coherent_changes now performs it implicitly. This ensures in-order delivery, and samples are no longer delivered after a call to begin_access. |
OSPL-8288 / 16504 OSPL-8289 / 16503 |
The use of the newly introduced detachAllDomain functionality occasionally resulted in processes blocking or crashing when detaching and terminating. The problem had two causes. Solution: Both causes have been addressed, so detaching and terminating no longer block or crash the process. |
OSPL-8442 OSPL-8435 / 16568 OSPL-8427 / 16564 |
The method detach_all_domains may fail when a listener callback
is blocking. The listener callback is called from shared memory context, which means that the thread executing the listener callback is registered as using shared memory resources. The detach_all_domain method tries to get all threads that are executing from shared memory context to either leave the shared memory context or to block, which would allow the shared memory to be detached safely. Thus, when the listener callback blocks in the application context, the detach_all_domain method fails with a timeout because this thread is still listed as using shared memory resources. Solution: When an application listener callback has to be executed, the thread that calls this callback first releases the shared memory resources it is using and leaves the shared memory context before calling the callback. |
OSPL-8443 |
The function DDS_wait_for_historical_data_w_condition() does not
handle the special value DDS_TIMESTAMP_INVALID correctly. The function DDS_wait_for_historical_data_w_condition() allows an application to provide conditions on the historical data to request. In particular, timestamps can be specified to indicate the interval in which the data was produced. The special value DDS_TIMESTAMP_INVALID can be used for both the lowerbound and upperbound of the interval and should be interpreted as don't care. Due to an error these special values were not handled correctly, causing them to be interpreted as invalid parameters and causing DDS_wait_for_historical_data_w_condition() to return the BAD_PARAMETER return code. Solution: The error is fixed, so DDS_TIMESTAMP_INVALID is now considered a valid value. Note that this issue has currently only been fixed for the C language binding. Other language bindings will be fixed in a future release. |
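Since OSPL-8139 (earlier in this table) makes the durability service refuse to start on a missing or invalid namespace configuration, explicitly configuring namespaces and policies is now advisable. The fragment below is a hedged sketch: the element and attribute names follow common OpenSplice durability configurations and may need adjusting for a concrete deployment.

```xml
<OpenSplice>
  <DurabilityService name="durability">
    <NameSpaces>
      <!-- An explicit namespace covering all partitions/topics, with an
           aligner/alignee policy; without a valid section like this the
           durability service now refuses to start and logs an error. -->
      <NameSpace name="defaultNamespace">
        <Partition>*</Partition>
      </NameSpace>
      <Policy nameSpace="defaultNamespace" durability="Durable"
              aligner="true" alignee="Initial"/>
    </NameSpaces>
  </DurabilityService>
</OpenSplice>
```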
Report ID. | Description |
---|---|
OSPL-8205 / 16386 |
Exception in thread ListenerEventThread causes NullPointerException. When using the Java API with listeners, a NullPointerException can occur when deleting an entity which has a listener attached. Solution: The cause of the fault is still unclear, but a workaround has been implemented to catch the exception so the listener terminates correctly. When this happens, a trace is also added to the ospl-error.log. |
OSPL-8388 |
dbmsconnect example configuration fails to open with the configurator. The dbmsconnect example XML configuration contains a '<>' as part of an SQL filter expression, but this needs to be escaped for the XML to be valid. Even though dbmsconnect is able to cope with this file, the configurator tool does not accept the invalid XML. Solution: The '<>' statement has been escaped in the XML file as '&lt;&gt;'. |
Report ID. | Description |
---|---|
OSPL-6586 |
Coherent updates are not always delivered to DataReaders for a
ContentFilteredTopic. ContentFilteredTopic DataReaders filter out samples that do not match the filter. Samples that are filtered out purely on key-value do not even get offered to the DataReader as they are filtered out a bit earlier in the process. For coherent sets, the DataReader, even though filtering out the sample, still needs to receive the sample to determine completeness of a coherent set as it counts all samples and compares this count with the expected number provided by the publisher. Therefore not receiving all samples results in an incorrect count and the complete coherent set not being made available to the application. Solution: For ContentFilteredTopic DataReaders, all samples are now delivered for counting purposes but immediately dropped afterwards so that no resources are consumed but completeness can be verified. Although samples are dropped, some additional administration is required to keep track of the data completeness and as a consequence some extra memory is kept during the time the coherent set is not completed yet. |
OSPL-7601 |
DDSI2 service lacks support for transient-local history settings
of KEEP_LAST with depth > 1. The DDSI2 service maintains transient-local data in accordance with the DDSI specification, but only implemented support for KEEP_LAST with depth 1. For OpenSplice this is more-or-less a non-issue as the durability service handles the history correctly anyway, but when interoperating this could be a restriction. Solution: The history maintained by DDSI now fully supports all history QoS settings. |
OSPL-7893 |
The durability service may select conflicting masters after a reconnect. When a reconnect occurs, the durability service has to determine a new master. The master selection procedure determines a new master in a number of rounds in which information is exchanged between the durability fellows to reach agreement about the selected master for each namespace. However, when the durability service is in the complete state, the master selection procedure selected a master too early, which could cause the durability fellows to select different masters and the potential merge of historical data to fail. Solution: When a reconnect occurs in the complete state, the master selection now waits during each stage until all information has been received from the other fellows about their master selection. |
OSPL-8114 |
When publishing a coherent set during durable data alignment, transactions may never become complete. When publishing a transaction during alignment of durable data, it was possible that the transaction never became complete: parts of the transaction could be received in duplicate, and each received part was counted and used to determine completeness. In that case the transaction never became complete because the expected count was lower than the actual count. Solution: The transaction mechanism has been updated to handle duplicate messages. The internal group must be complete before the readers receive the EOT markers. |
OSPL-8126 |
When the durability service terminates while rejected samples
still need to be delivered to the readers, then termination is
stalled until the samples are delivered. When the durability service receives alignment data, it tries to deliver the data to the readers. When delivery is rejected (e.g., because a reader has reached its resource limits), the durability service periodically retries to deliver the data. When the durability service had to terminate while periodically retrying to deliver rejected samples, termination only succeeded after all samples had been delivered successfully. This negatively affected the responsiveness of the durability service to a termination request. Solution: The durability service no longer delivers rejected samples when termination occurs. |
OSPL-8136 |
Possible missing of async RMI replies when the server exits and starts again. A C++ RMI client may miss replies from a server when the server replies to an asynchronous request and exits, and is then restarted to process other requests while the client remains active. The reply instance is then disposed and, as the client takes the replies, it receives an invalid sample that disturbs the reply management and makes RMI miss the following valid replies. Solution: Invalid samples for the RMI replies datareader have been disabled at the client side. |
OSPL-8152 / 16371 |
Crash in the shared memory monitor thread due to a stack size limitation. When using a datamodel with many indirections and nesting, an application crash could cause the shared memory monitor to crash as well, because it ran out of stack space during its attempt to clean up the application's resources. Solution: It is now possible to configure the stack size for all splice daemon threads in the configuration section //OpenSplice/Domain/Daemon. Each thread now has an additional element "StackSize" which can be set to the desired size. An example to set the shmMonitor stack to 1 MiB: <shmMonitor><StackSize>1048576</StackSize></shmMonitor> When this element is not specified the thread will use the default size of 512 KiB. |
OSPL-8172 |
PublicationInfo for group coherent writer notification unsafe. When the splice daemon receives a PublicationInfo message that describes a coherent writer, a notification is sent to the publisher in an unsafe way, which caused potential multithreaded manipulation of the internal cached writer administration in the coherency mechanism. Solution: Made the notification threadsafe. |
OSPL-8185 / 16385 |
Potential crash during simultaneous creation and deletion of a
shared DataReader/Subscriber. There is a race condition in the implementation of shared DataReaders/Subscribers: if you simultaneously delete the last occurrence of a shared Reader/Subscriber while at the same time creating a new reference to it, the system might end up crashing. Solution: The race condition has been removed. It is now possible to safely remove the last occurrence while at the same time creating a new reference to it. |
OSPL-8189 |
Group coherent reader can access subscriber after deletion. After deletion of the subscriber, a group coherent reader may still try to access the subscriber, which causes a crash. Solution: The data reader has a back-reference to the subscriber, which is now removed in a threadsafe manner and checked before trying to access the subscriber. |
OSPL-8191 |
After calling begin_access, historical data could still be
delivered to a group coherent reader. Reader creation retrieves historical data with the begin_access (read)lock instead of with the access (write)lock. This can cause data to change after a begin access in a multithreaded environment. Solution: Historical data is now retrieved within the access (write)lock. |
OSPL-8208 |
C# API Waitset.wait causes NullReferenceException. When using the C# DDS API with waitsets a NullReferenceException may occur when using the Waitset.wait call. Solution: The fault in the waitset handling for C# is fixed. |
OSPL-8212 / 16390 |
RTI/TwinOaks interoperability issue with
"autodispose_unregistered_instances" QoS. RTI and TwinOaks have a different interpretation of the "autodispose_unregistered_instances" writer QoS, which could cause unexpected disposes of instances in OpenSplice when an RTI or TwinOaks writer disappears. Solution: The DDSI2 service has been modified to distinguish between ADLINK and non-ADLINK peers when defaulting the QoS setting, eliminating these unexpected disposes. |
OSPL-8222 |
Multi-partition Publishers don't disconnect properly from all
targeted partitions. Publishers that are connected to multiple partitions can cause problems when they need to disconnect from one or more of their partitions (for example because the PublisherQos is changing one or more partitions or because the writer is deleted.) These problems manifest themselves by not properly sending unregister messages to all targeted partitions, which may cause the InstanceState not to go to NO_WRITERS when it should and which may cause v_registration objects to leak away in the shared memory. Solution: Multi-partition Publishers now disconnect properly from all targeted partitions, which should result in the correct InstanceState for all partitions and which should remove the leaking of v_registration objects. |
OSPL-8224 |
Spinning waitset when an empty coherent set is received. A data_available notification was sent to all readers that are part of a transaction, even to readers that did not receive data. A waitset followed by a get_datareaders call (which returned 0 readers) could then enter a spin loop, because no read was performed to reset the data_available flag. Solution: data_available notifications are now only sent to readers that have data. |
OSPL-8233 |
Completing a pending group coherent transaction could cause a
crash after reader deletion. When a group coherent transaction becomes complete it is added to a pending list. This pending list is flushed to the readers upon a call to begin_access. When a reader was deleted, begin_access still tried to flush the deleted reader's part of the transaction, which could cause a crash. Solution: Data for the removed reader is now removed from the pending list on deletion of that reader. |
OSPL-8267 / 16494 |
DDS protobuf compiler fails when including more than one
depth-of-directory. When proto files include others that reside in directories more than one level away, the DDS protobuf compiler plugin would fail to generate code: the output directory for the generated code was not created recursively, causing the directory creation, and therefore code generation, to fail altogether. Solution: The code generator has been modified to create directories recursively in all situations. |
OSPL-8268 |
detach_all_domains operation doesn't return a failure when
something went wrong. The detach_all_domains function always returned OK. Even though no DDS call can be made afterwards, it may still prove useful to know whether all went well. Solution: The function no longer returns OK in case of a detectable failure. In case it took too long to wait for threads to leave shared memory, TIMEOUT is returned. |
OSPL-8271 |
isocpp2 target for idlpp crashes on enum in outer scope. When idlpp attempts to compile an IDL file that has an enum in the outer scope (i.e. not embedded in a module) to the isocpp2 target, it crashes with a segmentation violation. Solution: The crash has been fixed in idlpp. |
OSPL-8302 / 16505 |
The detach_all_domain operation may cause a process to crash due
to a race condition. When detach_all_domains returns successfully, it has detached the shared memory segments from the process. Before that, it denies all threads access to the shared memory segments. However, a race condition existed which could leave a thread inside a critical section after the shared memory had been unmapped from the process. That thread could then try to access an object in shared memory, crashing the process. Solution: The detach_all_domain operation now waits for all threads to have left this critical section before unmapping the shared memory segment(s) from the process. |
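The StackSize configuration introduced by OSPL-8152 above fits into the domain configuration roughly as follows; the shmMonitor element and the //OpenSplice/Domain/Daemon path are taken from the issue text, while the surrounding Name element and domain name are illustrative.

```xml
<OpenSplice>
  <Domain>
    <Name>myDomain</Name>
    <Daemon>
      <!-- Give the shared memory monitor thread a 1 MiB stack;
           when omitted, the default of 512 KiB is used. -->
      <shmMonitor>
        <StackSize>1048576</StackSize>
      </shmMonitor>
    </Daemon>
  </Domain>
</OpenSplice>
```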
Report ID. | Description |
---|---|
OSPL-7947-1 |
Transaction sample received twice, as historical data and as
normal write, causing the transaction to never become complete. When a reader was created, it asynchronously connected and requested historical data, making it possible to receive the same sample both via historical data and via the normal path; this sample was counted twice, which caused the transaction to never become complete. Solution: A reader now connects and requests historical data synchronously so that the sample is not received twice and the transaction becomes complete. |
OSPL-7947-2 |
When discovering a transactionWriter implicitly, it was possible that transactions of the same writer never became complete. The group transaction mechanism has two ways of discovering writers: via builtin topics, and implicitly when data from that writer is received. In the latter situation it was possible that transactions which depended on the same writer never became complete. Solution: When a writer is implicitly discovered, all open transactions are checked for a dependency on the writer and the writer is marked as discovered so that the transaction can become complete. |
OSPL-6395 / 14540 |
Integration with Rhapsody 8.x. The name of the abstraction-layer header file ("os.h") was too generic and caused a collision with a file by the same name belonging to IBM Rhapsody. To prevent this, and other potential name clashes in the future, the file was renamed to "vortex_os.h". Solution: In case the C language binding is used, this header file is included in idlpp-generated code. This means code generated by a previous version of OpenSplice cannot be compiled against the new header file and should be regenerated first. Applications already compiled for a previous version are not affected. |
OSPL-7562 |
Durability service does not handle default partition properly in
name-space configurations. The OpenSplice durability services exchange name-space information before starting alignment of historical data. If name-space contents do not match, the services refuse to align with each other. The algorithm that compares the expressions had a problem dealing with the so-called default partition (an empty string), leading to an incorrect interpretation of ".<TOPIC>" expressions when comparing name-space contents. Solution: The flawed algorithm has been corrected to deal with the default partition properly too. |
OSPL-7601 |
DDSI2 lacks support for transient-local history settings of
KEEP_LAST with depth > 1 The DDSI2 service maintains transient-local data in accordance with the DDSI specification, but only implemented support for KEEP_LAST with depth 1. For OpenSplice this is more-or-less a non-issue as the durability service handles the history correctly anyway, but when interoperating this could be a restriction. Solution: The history maintained by DDSI now fully supports all history QoS settings. |
OSPL-7687 |
bundled throughput-example doesn't work correctly on Windows. The bundled throughput example uses a flawed algorithm to determine how long an action takes. Solution: The flawed algorithm has been corrected. |
OSPL-8121 / 16218 |
Incorrect publication_handle in received SampleInfo. The publication_handle in the SampleInfo object of a received Sample was previously set to the same value as the instance_handle, which was incorrect. Solution: SampleInfo::publication_handle() now returns the correct publication_handle as expected. |
OSPL-8124 |
Improve group transaction flush mechanism. The group transaction flush mechanism was inefficient as it used an unnecessary list. Solution: The unnecessary list has been removed. |
OSPL-8144 |
Java5 listener may trigger before entity is fully initialised. Entities created through the Java5 API with a listener attached at creation time may cause problems, as the listener may trigger before the entity is fully initialised. Solution: Listener callbacks are now blocked until the entity is fully initialised. |
OSPL-8149 |
Server-part of client-side durability protocol must be enabled
for all default shared memory configurations. The durability service has a feature to react to requests from clients that have an interest in historical data but are not able (or willing) to run a durability service themselves. To have a durability service react to such requests, a configuration option //Opensplice/DurabilityService/ClientDurability[@enabled] exists that had to be explicitly set to TRUE. This is not very user-friendly and hampers a seamless out-of-the-box experience. Solution: Durability services now also react to client requests when the //Opensplice/DurabilityService/ClientDurability element is present without the [@enabled] attribute. Furthermore, the default shared memory configurations that are part of the distribution have been updated so that the client durability feature is enabled. |
OSPL-8151 / 16355 OSPL-8159 / 16376 |
JVM crash when creating a participant from a thread with a custom name. When using the Java API, the JVM crashes if the thread creating the participant has a custom name set. Solution: The defect is solved and the JVM no longer crashes when the creating thread has a custom name. |
OSPL-8157 |
Coherent updates do not get delivered when using a wildcard
partition. When a Publisher is publishing coherent updates in a wildcard partition (a partition that uses a wildcard like '*' or '?'), or a Subscriber is subscribing to a wildcard partition, then coherent updates are not correctly matched between Publisher and Subscriber and the contents of the coherent update will be lost. Solution: Wildcard partition matching between Publisher and Subscribers has been improved: either the Publisher or the Subscriber can now use a wildcard partition without impacting the coherent update. However, if both Publisher and Subscriber use an unequal yet matching wildcard partition a mismatch may still take place. Fixing this scenario is left for a future update. The issue was already captured in the known issues list under OSPL-973 |
OSPL-8161 / 16377 OSPL-8071 / 16171 |
NullPointerException in the ListenerThread when using Java. When using the Java API with a listener on an entity, the ListenerThread could throw a NullPointerException when the entity was removed. The info logfile would then also show the message "timeout or spurious wake-up happened x times." Solution: The defect in the listener mechanism is solved and the deletion of an entity no longer causes these exceptions and info messages. |
OSPL-8169 |
In case of group coherent updates where more than one transaction is involved and both simultaneously become complete by actions on different readers, a deadlock occurred. A cross-locking deadlock involving the reader lock and the group lock caused the problem: when an action performed on a reader leads to a complete transaction, it locks the group to notify it of the completeness; if the group itself then also becomes complete and notifies all other readers, this last step takes the locks in reversed order, leading to a deadlock when two transactions become complete through two simultaneous actions on different readers. Solution: The group now releases the lock before notifying all other readers. |
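Following OSPL-8149 above, enabling the server side of the client-durability protocol can be as simple as the fragment below. The service name attribute is illustrative; per the fix, the bare presence of the ClientDurability element is now sufficient, and setting the enabled attribute explicitly remains valid.

```xml
<OpenSplice>
  <DurabilityService name="durability">
    <!-- Presence of this element now enables the feature;
         <ClientDurability enabled="true"/> remains equivalent. -->
    <ClientDurability/>
  </DurabilityService>
</OpenSplice>
```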
Report ID. | Description |
---|---|
OSPL-5028-1 |
Launcher - Choosing a file for OSPL_URI that is not an OpenSplice configuration file results in errors. In the Vortex OpenSplice Launcher Settings, under the Environment tab, if one were to use the file chooser to set the URI or the License field and selected a file type that is not meant to go there, errors would be reported in the log and Launcher would not otherwise guard against attempts to process the invalid files. Solution: For the file chooser dialogs for the URI and License fields, an extension filter has been added to show only .xml and .lic files respectively. |
OSPL-5028-2 |
Launcher - NullPointerExceptions get printed to console every time a configuration is selected. Whenever an OpenSplice configuration was selected from the configurations table, a NullPointerException and stacktrace would get printed to the console, due to an unchecked return from the examples table selection model. Solution: Accesses to the examples selection model are now properly checked for null returns. |
OSPL-7087 / 15090 |
Adding support for unexpected stopping of the underlying DDS middleware to the Java RMI library. When the DDS middleware stops (normally or not) while an RMI server is waiting for or processing RMI requests, the RETCODE_ALREADY_DELETED error code should be handled properly. Solution: The RMI implementation shuts down the RMI server when it receives a RETCODE_ALREADY_DELETED, which signals that the DDS middleware has stopped. This unblocks the server application thread that was running the RMI runtime to wait for requests. Note that the DDS middleware must not be stopped before the RMI application has stopped, whether at the client or the server side. Even though the RMI library has been updated to handle an unexpected stop of the DDS middleware, the user application cannot continue to work correctly afterwards. |
OSPL-7283 / 15405 |
GettingStartedGuide is missing information about how to install
OpenSplice on UNIX ARM platform. The procedure to install OpenSplice on a UNIX ARM platform is missing from the GettingStartedGuide. Solution: The GettingStartedGuide is extended with a description of the OpenSplice installation on a UNIX ARM platform. |
OSPL-7317 |
Not possible to create reader for builtin topic DCPSType. It was not possible to create a reader for the builtin topic DCPSType because the TypeSupport for DCPSType was not mapped onto the kernel representation, as is required for builtin types; consequently the typesupport was not registered and the DCPSType reader was not created for the Builtin Subscriber. Solution: DCPSType typesupport is now generated correctly and registered, and a DCPSType builtin reader is created. |
OSPL-7346 |
Truncation of max file size configuration values on 32-bit
platforms. An issue in the configuration file processing caused truncation of the MaxFileSize RnR storage parameter on 32-bit platforms. Solution: To resolve the issue, configuration processing was improved. In addition to resolving the truncation, it now supports floating point values and the following list of units: b, B (bytes), KiB, kB, K, k (=1024 bytes), MiB, MB, M, m (=1024 KiB), GiB, GB, G, g (=1024 MiB), TiB, TB, T, t (=1024 GiB) |
OSPL-7618 / 15802 |
ReadCondition and ContentFilteredTopic could lead to a memory leak in isocpp2. When using a ReadCondition or a ContentFilteredTopic in isocpp2, it is possible that they are not properly cleaned up after removing them, causing a memory leak. Solution: The leak has been fixed. |
OSPL-7627 / 15740 |
Queries/Filters had problems with literals representing fields of
type "unsigned long long" (uint64_t). Queries/Filters were not able to correctly parse literals of type "unsigned long long" whose values were between MAX_INT64 and MAX_UINT64. Solution: The parser has been fixed. |
OSPL-7686 / 15837 |
Deleting a domain (last participant from a domain) doesn't free
all used resources correctly in single process mode. Deleting a domain (i.e. the last participant from a domain) did not free all used resources correctly in single process mode. Solution: By setting the database size in single process mode, the database is allocated on heap and used by the OpenSplice memory manager. Deleting the domain with this database now results in a correct cleanup of all used resources. Note: In previous versions the Size attribute for the Database element in the configuration had no meaning and was ignored; all available heap memory was available to the domain service. With the current implementation the database size is limited to the configured size and memory is managed by the ospl memory manager. Size 0 is the default and forces the old unlimited behaviour, where the operating system memory manager is utilised. |
OSPL-7694 / 15847 |
Problem with string writing in the Tuner. When using the Tuner to write a string (bounded or unbounded), square brackets were always added to the input. Solution: The writing mechanism is changed and square brackets are no longer added. |
OSPL-7723 / 15852 |
Suppress default logs true provides null data to report plugin. When using a report plugin with the option SuppressDefaultLogs set to TRUE and the element TypedReport set instead of Report, null data appeared in the specified report log. Solution: The report plugin is adjusted so that in this scenario the reports are passed to the specified log file. |
OSPL-7743 / 15912 |
Possible missing of some async replies when their server exits. The problem may occur when a server replies to an asynchronous request and exits, and is then re-started to process other requests while the client remains active. The reply instance is then disposed, and as the client takes the replies it receives an invalid sample that disturbs the reply management and makes RMI miss the following valid replies. Solution: Invalid samples are disabled for the replies datareader at the client side. |
OSPL-7762 |
DDSI declaring readers on slow machines "non-responsive". DDSI2's flow-control was extremely sensitive to the configuration of the maximum allowed amount of unacknowledged data in a writer, the relative speeds of the machines and the networks and the socket receive buffers on the subscribing nodes. Solution: DDSI2 now dynamically adapts the maximum amount of unacknowledged data, based on retransmit requests. The adaptation can be disabled for increased predictability if required. |
OSPL-7772 |
Missing samples after CATCHUP or REPLACE merge has occurred. When nodes reconnect and a CATCHUP or REPLACE merge policy has been configured, then alignment takes place. It turns out that the alignment data contains the right amount of data, but after injection of the data in the system there are less samples than expected. This phenomenon was caused due to an incorrect administration of the last dispose time which prevented that data that has been produced before the last dispose time could be inserted. In case a foreign state conflict appears and there is no aligner, rediscovery of an aligner would not lead to a merge. Solution: The last dispose time is not set when alignment for CATCHUP or REPLACE takes place, and rediscovery of aligners in case of foreign state conflicts is fixed. |
OSPL-7777 |
Launcher should display the reason why tools cannot be
started. If Java is not installed on the host machine from which Launcher is started, the tools cannot be started even though the buttons are enabled. Solution: To help troubleshoot incompatibilities between our tools and other Java implementations, Launcher now picks up the JAVA_HOME environment variable from the user's environment (just like the other variables) and allows the user to specify or override their own JAVA_HOME path within the tool. Launcher detects whether a JRE and JDK are installed. When modifying the JAVA_HOME environment variable in Launcher, or when starting Launcher with JAVA_HOME set to a non-officially supported Java implementation (OpenJDK, IBM Java, etc.), a notification comes up indicating which Java implementation was picked up and that it is not officially supported for running the tools and building/running the examples; the tools and examples remain enabled in this case. If Java is not detected in JAVA_HOME, the tool buttons are disabled and a message is displayed. The buttons re-enable once a valid Java install directory is specified in JAVA_HOME. |
OSPL-7814 / 15970 |
IllegalMonitorStateException when using the java API in
combination with a listener. When using the Java API in combination with a listener, an IllegalMonitorStateException could occur. Solution: The cause of the exception has been fixed in the listener handling. |
OSPL-7816 / 15972 |
Durability service doesn't notice disconnection. When using durability in combination with RTNetworking it could happen that when the network connection is lost the durability service is not notified about this. Solution: The defect in the disconnect mechanism is fixed and durability now gets the disconnect notification. |
OSPL-7823 |
As a user, I want to be able to preview all the environment
variables. Allow the user to preview the environment variables that are available to the user through Launcher. Solution: A new Preview Environment Variables dialog is available to the user through the Preview button in the settings environment tab. The new dialog allows the user to preview the configured environment variables and copy them to the clipboard. |
OSPL-7839 / 15981 |
Generation error in C# backend of idlpp. When compiling an IDL file that specifies an array of enumerations, the C# backend of idlpp would generate a statement that contained a superfluous bracket, which caused a compilation error in the C# compiler. Solution: The C# backend of idlpp has been fixed by removing the superfluous bracket. |
OSPL-7852 / 15991 |
Warning during compilation in isocpp2. When compiling an application that uses isocpp2, a warning can occur in State.hpp. Solution: The warning has been fixed. |
OSPL-7854 / 15990 |
Coherent Set transactions do not properly process dispose messages. When a coherent set contains a dispose message, it can happen that the dispose is not properly handled, causing late joiners to align the disposed message with the status NOT_ALIVE_DISPOSE_INSTANCE_STATE where the message would be expected to be completely gone from the system. Solution: The coherent set mechanism has been updated to handle dispose messages correctly. |
OSPL-7857 / 15988 OSPL-7697 / 15841 |
Memory could leak away when instances are recycled aggressively
during an overflow of the networking queue. When an instance is unregistered but quickly brought back to life (by writing/disposing a sample with the same key), there is a small chance that in case of an overflow of the network queue some memory is leaking away. Solution: The memory leak has been fixed. |
OSPL-7865 |
Enable java debug symbols. To improve readability of exception stack-traces and help with debugging Java applications, symbols need to be included in the jars. Solution: Symbols are now included in all jar files. As a result, the average size of the jar files has increased by ~25%. It is possible to rebuild the Corba-Java language-binding jar using the custom-lib build process. To create a jar without symbols, change the custom-lib build script, replacing -g by -g:none. |
OSPL-7876 |
RT networking crash when using scoped discovery. RT networking configurations relying on roles and scopes to restrict discovery of nodes could crash on an invalid memory reference upon discovering a new node. Solution: The issue has been corrected. |
OSPL-7877 |
Durability merge policies can cause data loss. For some merge policies durability needs to dispose all (or a particular subset) of the instances in a group, but it must not use the internal disposeAll functionality, as that implements the "dispose-all" operation that has an effect into the future. Because of that effect, old data could be lost following a CATCHUP/REPLACE/DELETE merge operation, including the data being aligned. Solution: The algorithm has been modified to no longer rely on this internal disposeAll functionality. |
OSPL-7889 |
idlpp may crash when generating for the isocpp2 target. idlpp crashes for the isocpp2 target when specifying an IDL enum in the outer scope (i.e. not in a module). Solution: idlpp has been fixed to correctly handle enums in the outer scope. |
OSPL-7890 |
Group coherence: Avoid the partial alignment of historical group
transactions. A late-joining Subscriber for group transactions may receive a group transaction partly as historical data and partly as incoming messages; this led to invalid delivery of partially complete group transactions. Solution: Partly aligned group transactions now trigger full group alignment and remove all related transaction administration. |
OSPL-7895 |
Tooling entities are not cleaned up after disconnecting. Entities created by tools were no longer deleted when disconnecting the tool, due to an incorrect reference count of entities in the C&M API. As a result, entities remained available in every federation the tool had connected to until that federation stopped. Solution: Corrected the reference count for entities in the C&M API. |
OSPL-7924 |
Add java5 and isoc++v2 DCPS examples to Launcher. The titular DCPS examples were added to Vortex OpenSplice, but the Launcher tool did not have the capability to detect them and add them to the list of available language options. Solution: The Launcher tool has been updated to recognize the java5 and isocpp2 DCPS examples in a Vortex OpenSplice installation, and is able to execute build and run tasks for them. |
OSPL-7931 |
DDSI2 possible crash when lease expiry coincides with
termination During termination, DDSI2 would delete all proxy entities explicitly, without accounting for the possibility that leases could still expire. This could lead to freeing a proxy participant prematurely, resulting in a use-after-free when deleting the one remaining endpoint. Solution: This issue has been solved. |
OSPL-7932 |
Group coherence: detect missing EOT message and discard
associated pending transaction. As soon as a Subscriber receives an EOT message of a (group) transaction, it knows from which writers it will receive EOTs. Whenever an EOT or any data for another writer-transaction is missing but newer data for those writers is received, the Subscriber can conclude that the missing EOT and/or data is lost forever and subsequently discard the whole group transaction. Solution: Received messages of group coherent updates that will not become complete are now discarded. |
OSPL-7933 |
Group coherence: Subscriber-side re-evaluation of completed
transactions to assure data meets the latest user expectation. An initializing Subscriber can receive transactions that match the Subscriber's readers before all readers are created, meaning that data belonging to a reader yet to be created is not considered and possibly not delivered. Solution: Completeness evaluation of ongoing group transactions now considers the creation of additional DataReaders that may affect completeness. |
OSPL-7938 |
In situations where the builtin topics do not have to be aligned,
they were still being aligned in case the AutoBuiltinTopics
namespace is generated. When no namespace for the builtin topics is configured, a namespace called 'AutoBuiltinTopics' is created automatically for the builtin topics. When no builtin topic alignment is needed (e.g., when DDSI takes care of the builtin topics) it could still happen that the builtin topics were being aligned, even though they shouldn't be. Solution: Alignment of builtin topics is prevented for the AutoBuiltinTopics namespace in case no alignment is needed for them. |
OSPL-7941 |
Resource leak in application after closing domain in shared
memory mode. If a domain is closed (i.e. the last participant in an application for that domain is deleted), not all used resources were freed correctly in shared memory mode. Solution: Used resources are now freed for the closed domain. |
OSPL-7946 |
Group coherence: ddsi crash during connection change. A crash could occur in the DDSI2 service when a connection change happens during a coherent update by an application. Solution: The crash during connection change has been fixed. |
OSPL-7947 |
Group coherence: complete transactions sometimes do not get delivered. Depending on timing, transactions, although complete, would not get delivered to application readers. Solution: Several bugs concerning race conditions have been solved. |
OSPL-7948 / 16142 |
Limitation on supported network interfaces. When using DDSI with more than 32 network interfaces it is possible that the requested network interface is not found, and DDSI will report an error. Solution: The number of supported network interfaces has been increased to 128. |
OSPL-7973 / 16144 |
Java 5 QosProvider error reporting improvement. When using the QosProvider in Java 5, all errors resulted in a null pointer exception without a proper error message. Solution: The error reporting for the Java 5 QosProvider has been improved. Each error now results in the appropriate Java exception with a useful error message. |
OSPL-7974 / 16145 |
Waitset events potentially leak away when received after the
Waitset's timeout value. If a WaitSet times out, and new events arrive before the unblocked WaitSet thread gets the time to execute and report the timeout, the new events will be lost and their memory will leak away. The chances of this happening increase when a WaitSet frequently times out while its thread has to compete for CPU time with one or more other threads that are generating events for that same WaitSet. Solution: A thread that is unblocked by a WaitSet because of a timeout will first check for pending events prior to reporting the timeout. If pending events exist, each event is processed accordingly and the timeout is not reported. If no pending events exist, the WaitSet will report the fact that it timed out. |
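The fix above follows a common pattern: a thread unblocked by a timed wait first drains any events that raced in before reporting a timeout. A minimal, generic sketch of that pattern (not the actual OpenSplice WaitSet implementation; the class and member names are made up):

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical event queue illustrating "check pending events before
// reporting a timeout", so events arriving during wakeup are not lost.
template <typename Event>
class Waitset {
public:
    void trigger(Event e) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push(e);
        cond_.notify_all();
    }

    // Returns true and pops an event; returns false (a genuine timeout)
    // only when nothing is pending after the wait, so no event can leak.
    bool wait(Event& out, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait_for(lock, timeout, [this] { return !pending_.empty(); });
        if (pending_.empty()) {
            return false;  // timed out with truly no pending events
        }
        out = pending_.front();
        pending_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::queue<Event> pending_;
};
```

Even if `wait_for` returns after the deadline, the re-check of `pending_` under the lock ensures a racing `trigger` is still delivered.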
OSPL-7978 |
DDSI connection fail-over when using Vortex Cloud routing
service. When DDSI is relying on the Vortex Cloud services, it should switch from one routing service instance to another in case of a network or routing service failure. The Vortex Cloud discovery service provides its clients with new addresses to use in such situations, but the DDSI2 service would not actually switch to the new addresses. Solution: The DDSI2 service has been modified to always use the most recent addressing information provided by the discovery service. |
OSPL-8004 |
Shared memory leak in set_qos operations. The OpenSplice kernel failed to release memory allocated in the process of changing the QoS of an entity, resulting in a memory leak each time a set_qos operation was used. Solution: The memory is now freed. |
OSPL-8015 |
Tuner export data throws exception. When attempting to export data using the Tuner, an exception was raised. This was caused by a change in the UI, where the QoS settings to use are presented in a different manner. Solution: Updated the internal algorithm to cope with the changed way the UI obtains QoS settings. |
OSPL-8021-1 |
DDSI2 crash in nn_address_to_loc when accepting TCP connection. The DDSI2 service could crash on accepting a TCP connection if requesting the peer address after accepting the connection failed, typically caused by the connection already having been closed. Solution: DDSI now handles this error condition correctly. |
OSPL-8021-2 |
Networking Bridge not enabling forwarding for a topic until new entities created/destroyed. The networking bridge processes the built-in topics describing the existing entities to determine what to forward. There existed cases in which the networking bridge would have been able to enable forwarding of data but instead would pause its discovery process until the arrival of new built-in topics. Solution: The networking bridge now always fully processes the available built-in topic samples before waiting for new ones to arrive. |
OSPL-8021-3 |
DDSI2 spamming the log with "malformed packet" warnings when
interoperating with RTI Connext. RTI Connext sends many packets that are not well-formed RTPS packets, and these lead to "malformed packet" warnings from DDSI2. It appears that recent changes to RTI Connext have caused a significant increase in the number of warnings, leading to huge log files with little value to users. Solution: DDSI2 now no longer logs the "malformed packet" warnings for commonly encountered packets, except when run in "pedantic" mode. |
OSPL-8021-4 |
Phantom entities in OpenSplice with DDSI2 and the Networking
Bridge when interoperating with other vendors' products. OpenSplice internally operates using globally unique identifiers that antedate the DDSI specification, and hence DDSI2 translates between the identifiers used in the DDSI specification and those used in OpenSplice. In combination with the Networking Bridge, this interaction could lead to the creation of "phantom" readers and writers as a consequence of an incomplete filter on the translated identifiers. Solution: The filtering now accounts for interoperating with other products. |
OSPL-8021-5 |
Networking Bridge discovery hang on topics used by other vendors'
implementations. The Networking Bridge requires complete topic definitions to be available, but the DDSI specification does not define interoperable type definitions. Therefore, the bridge can encounter readers and writers for topics that do not exist in OpenSplice. Solution: The networking bridge now skips endpoints that use undefined topics; if and when a topic becomes available, the endpoints are re-evaluated. |
OSPL-8029 |
When a client requests historical data using client-durability
that matches multiple partition/topic combinations, multiple
responses are generated. One of the ways to obtain historical data is to use the client-durability feature. When a client requested data that matched multiple partition/topic combinations, the server responded with multiple data sets (one for each partition/topic combination). The intended behavior is that a single response is generated that contains the aggregated data from all requested partition/topic combinations. Solution: The problem has been fixed, and a single response is now generated that contains all requested data. |
OSPL-8032 |
Mmstat can report more memory available than configured. Mmstat determines the amount of available memory through a fairly involved calculation, in which a mistake was introduced, causing it to potentially report more available memory than was configured. Solution: The calculation has been corrected. |
OSPL-8034 |
DDSI memory leak caused by a race condition between discovery and
termination DDSI discovery runs in a separate thread that takes its input from the network. During termination, the network input was stopped, but the discovery thread could still be processing a participant discovery message. In this case, deleting all proxy participants could occur too soon, leaking a proxy participant. Solution: Processing DDSI discovery data is now forced to complete in time. |
OSPL-8055 |
Group coherence: transaction does not become complete. In some scenarios transactions would not become complete and would therefore not be delivered, due to an incorrect algorithm in the matching of writers and readers. Solution: Several scenarios that resulted in incorrectly dropping group coherent updates have been solved. |
OSPL-8058 |
Group coherence: memory leakage. The introduction of the group coherency feature introduced memory leaks in some scenarios, including situations where group coherence was not even used by any application. Solution: Several memory leaks concerning group coherent updates have been fixed. |
OSPL-8072 |
Maximum Domain ID value (230) is not enforced in the API or
during startup of the domain. The maximum Domain ID value (230) was not enforced in the API or during startup of the domain. Solution: The domain will refuse to start when the domain ID in the configuration file is out of range (larger than 230). Trying to pass an invalid domain ID in a function call will result in an error in the info log and cause the call to fail. |
OSPL-8091 / 16180 |
Coherent Set transaction leaks v_transactionPublisher object. Each time a coherent set publisher was created, a v_transactionPublisher object would leak. Solution: The coherent set mechanism has been updated and the object is now properly freed. |
OSPL-8095 / 16182 |
Durability crash when inserting out-of-order disposed messages. When durability is used to insert out-of-order disposed messages, the service could crash. Solution: The defect has been solved and durability no longer crashes on out-of-order disposed messages. |
OSPL-8123 |
The persistent store retaining the wrong samples. When committing a transaction on a KEEP_LAST history where the transaction contained more samples for a single instance than the depth of the history, the (KV) persistent store would not always retain the N latest samples. Solution: The persistent store has been fixed to retain the latest samples in this case as well. |
TSTTOOL-265 / 15462 |
Implement built-in script variables for the current scenario/macro
filename and path. The Tester scripting engine has facilities for calling on built-in variables that can be referenced from any script execution. Two new variables that hold the values of the currently executing script's file name and file path are needed. Solution: The new built-in variables have been added to the Tester scripting engine as variable names "script_file" and "script_path". See Tester user guide section 6.1.2 for the update. |
TSTTOOL-343 |
Tester's statistics browser can't display statistics
information. Navigating to the statistics tab and attempting to view statistics information for DDS entities does not currently work. The Tester log file reports that the entities are not available. Solution: The CM objects that the statistics workers held to gather statistics from were being freed too early. The unintentional free has been fixed and the statistics view works again. |
Report ID. | Description |
---|---|
OSPL-6050 / 14407 |
DDSI2 MaxMessageSize and FragmentSize are no longer considered
Internal options DDSI2 MaxMessageSize and FragmentSize are no longer considered Internal options and should therefore be moved from the Internal to the General section. Solution: The DDSI2Service/Internal/MaxMessageSize and DDSI2Service/Internal/FragmentSize options have now been moved to DDSI2Service/General/MaxMessageSize and DDSI2Service/General/FragmentSize. The old setting is still supported as a deprecated setting and causes a warning when used. |
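After this change the two options are configured under the General section, for example as follows (an illustrative fragment; the service name and values are made up):

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <General>
      <!-- previously DDSI2Service/Internal/MaxMessageSize and
           DDSI2Service/Internal/FragmentSize (still accepted, deprecated) -->
      <MaxMessageSize>4096B</MaxMessageSize>
      <FragmentSize>1300B</FragmentSize>
    </General>
  </DDSI2Service>
</OpenSplice>
```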
OSPL-6467 |
When a client sends a historical data request to the
client-durability server, the server did not respect the timeout
that was specified in the request.
When a client sends a historical data request to the client-durability server, the client can specify a timeout value to indicate the time that the server may take to answer the request. Up to now the server did not respect the timeout value and would always answer immediately. With this fix the server now respects the timeout value. Solution: The server now queues requests and answers them based on their timeout value. |
OSPL-6983 |
When a client sends a request for historical data to the
client-durability server, the server must ensure that the client's
historical data reader is discovered.
One way for a client to obtain historical data is by publishing a request for historical data and waiting for the response from the client-durability server. In case the server uses ddsi as the networking service, ddsi first must match readers and writers before communication can take place. If the server sends a response to a request, but ddsi has not yet discovered the reader to deliver the response to, then ddsi will drop the response. To prevent this from happening the server must either ensure that the reader has been discovered before sending the response, or send back an error indicating that no reader was discovered in time. Solution: The server now contains functionality to detect whether the reader of a client has been discovered. |
OSPL-7348 |
userClock is missing from the configurator. userClockService is not a service but an option for a domain in the configuration. The tag userClockService is confusing. Solution: The configuration tag userClockService is now changed to userClock. The old tag userClockService is still supported for backwards compatibility, but will result in a warning in the info log that a deprecated tag is used. |
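With the renamed tag, a user clock configuration would look roughly as follows. This is a hedged sketch: the child element and module name shown are illustrative, and the exact schema should be checked in the deployment guide.

```xml
<OpenSplice>
  <Domain>
    <Name>ospl</Name>
    <!-- formerly <UserClockService>, which is still accepted but
         deprecated and logs a warning -->
    <UserClock>
      <UserClockModule>myclockmodule</UserClockModule>
    </UserClock>
  </Domain>
</OpenSplice>
```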
OSPL-7716 / 15848 OSPL-7708 / 15846 |
Sometimes a RETCODE_ERROR was reported during the processing of a
dispose_all_data request. When the durability service contained a sample with a newer timestamp than the timestamp for the dispose_all_data request, it would decide to exclude the newer sample from the dispose request. Its return status would clearly communicate that decision, but this was wrongfully interpreted as an unspecified error. Solution: The interpretation of the return status of the durability service has been corrected, and will no longer consider the exclusion of samples newer than the dispose_all_data request as an unspecified error. |
OSPL-7733 |
Premature deletion of a writer may cause DDSI to drop data
when client-durability is used. When a client sends a historical data request to a server, the server is expected to deliver the response in the partition specified in the request. For that reason the server creates a writer, publishes the data, and destroys the writer again. If ddsi has not yet taken the data before the writer is destroyed, the data will not be delivered. To prevent this situation, premature deletion of the writer must be prevented. Solution: The writer is cached for some time before it is deleted. This gives ddsi sufficient time to take the data. |
OSPL-7775 |
ISOCPPv2 union generation from idl can fail. The generation of an IsoCpp2 union from an idl file will fail if a) the union has char as switch type, b) it has a default case, and c) it is built on a platform on which char is unsigned by default and the signed-char compiler flag is not used. The result can be a wrongly initialized union class, an uncompilable generated header file or an idlpp crash. All these problems were caused by the fact that idlpp expected the minimum and maximum values of a char to be -128 and 127 respectively. This is untrue when the char is unsigned. Solution: The maximum and minimum values of a char are now dependent on its default signedness. |
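The root cause above can be demonstrated in portable C++: the range of plain char differs between platforms, so hard-coding -128..127 is wrong wherever char is unsigned by default. A small self-contained sketch (assuming the common 8-bit char):

```cpp
#include <limits>

// On platforms where plain char is signed its range is typically
// [-128, 127]; where it is unsigned, the range is [0, 255]. Code that
// hard-codes -128/127 as char bounds (as idlpp did) therefore breaks
// in the unsigned case.
constexpr bool kCharIsSigned = std::numeric_limits<char>::is_signed;
constexpr int  kCharMin      = std::numeric_limits<char>::min();
constexpr int  kCharMax      = std::numeric_limits<char>::max();
```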
OSPL-7782 |
Memory leak using Java5 DDS API DataReader.Selector. Whenever a DataState or Content filter was applied to a Selector, an underlying ReadCondition got created. The ReadCondition had native resources associated with it and would only be freed when the DataReader associated with the Selector got deleted. This could lead to a serious increase of memory usage and eventually result in an OutOfMemoryError in case new Selectors were allocated frequently in combination with setting a DataState or Content filter. Solution: In case of just a DataState, no ReadCondition is allocated. This also improves performance of the Selector. Additionally, the finalizer of the Selector now frees native resources associated with a ReadCondition (only applicable if a Content filter is applied). Finally, the Selector implementation has been made thread-safe as well by returning a new Selector object every time a setter method is called. |
OSPL-7785 / 15959 |
JVM crash during deletion of data reader or data writer. A mistake in the reference counting of basic types in the code constructing samples of the built-in topics for the user_data, group_data, topic_data and partition name settings could eventually cause the freeing of a basic type that is supposed to remain in existence during the operation of the domain. Use of this type in subsequent operations could then cause a crash. The issue could only occur when these QoS are set to a non-empty value. Solution: The reference counting for these cases has been corrected. Note that the issue is not limited to Java and that the crash can also occur in situations other than deleting a data reader or writer. |
OSPL-7794 |
ISOCPP2 dds::core::Time::operator > and < are broken The > and < operators on dds::core::Time in the ISOCPP2 API are not working properly due to an incorrect algorithm for comparing the nanoseconds part. Solution: The internal algorithm has been fixed to make the operators work properly again. |
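The correct ordering for such a two-part (seconds, nanoseconds) time value compares the seconds first and only falls back to the nanoseconds on a tie. An illustrative sketch of that algorithm (a hypothetical stand-in type, not the actual `dds::core::Time` source):

```cpp
#include <cstdint>

// Hypothetical stand-in for a two-part time value.
struct Time {
    int64_t  sec;
    uint32_t nanosec;
};

// Compare the seconds part first; the nanoseconds part only decides a tie.
inline bool operator<(const Time& lhs, const Time& rhs) {
    if (lhs.sec != rhs.sec) {
        return lhs.sec < rhs.sec;
    }
    return lhs.nanosec < rhs.nanosec;
}

inline bool operator>(const Time& lhs, const Time& rhs) {
    return rhs < lhs;
}
```

Comparing only the nanoseconds part (or mixing the two parts) would, for example, wrongly order 1s 999999999ns after 2s 0ns, which is the class of bug fixed here.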
OSPL-7808 |
ISOCPPv2 domain_id function returned default_id. The isocpp2 'dds::domain::DomainParticipant::domain_id()' function returned 'org::opensplice::domain::default_id()' when the DomainParticipant was created with default_id. The DomainParticipant delegate kept a copy of the domain_id that was used during creation instead of requesting the actual domain_id from the underlying core. Solution: The 'dds::domain::DomainParticipant::domain_id()' function now gets the actual domain_id from the underlying core. |
OSPL-7822 |
Possible crash due to double free in listener. The listener has a thread that waits for events and dispatches them. After dispatching, the events are freed. Events are also freed before dispatching when the source entity was destroyed in the meantime. It was possible that these events of the destroyed entity were still being handled and thus freed a second time. Solution: Fixed the wait-for-events loop. |
Report ID. | Description |
---|---|
OSPL-7761 / 15836 |
Illegal time is reported repeatedly when the native networking service is used. On some Windows platforms the native networking service does not serialize the source timestamp of the messages correctly, which causes the receiver to read an incorrect timestamp and report the illegal time error. This is caused by a code construction in combination with an optimization made by a compiler in the order of parsing the second and nanosecond parts of a message, which causes the seconds and nanoseconds to be sent on the wire in the wrong order. Solution: The code that serializes the source timestamp has been changed to ensure that the parts are parsed in the correct order. |
Report ID. | Description |
---|---|
OSPL-6481 |
Launcher can not load user-defined files. When a user creates their own deployment xml file (e.g. %OSPL_HOME%\etc\config\My_ospl.xml), Launcher can not load it and can therefore not apply it. Even if you stop Launcher and restart it, it does not see the new xml file. Solution: Updates to the user-specified configurations field in the directory settings panel, or to the OSPL_URI field in the environment settings panel, now trigger a refresh of the configurations list in the Configurations page. A new Refresh button has been added on the Configurations page; if it is pressed, the configurations list is rebuilt using OSPL_URI, the user-specified configurations directory, and the OSPL_HOME\etc\config directory. Duplicates are removed if they exist in more than one of these locations. Notifications have been added to warn users if a configuration no longer exists when trying to edit it or set it as the default configuration. |
OSPL-7664 |
Memory leakage when using group coherence with volatile topics. Every group coherent transaction leaked memory, as the EOT message, as opposed to other transaction messages, created a new group-owned transactionGroup which never became complete and was therefore never removed. Solution: Prevent EOT messages from creating group-owned transactionGroups for volatile topics. |
OSPL-7700 / OSPL-7714 / 15845 |
Group coherence data possibly wrong during discovery. When group coherent data was written and not all writers had been discovered yet via builtinTopics, it was possible that wrong data was flushed, or that data was never flushed, as the mechanism for determining the completeness of a group was flawed. Solution: The mechanism for calculating completeness has been reworked so correct completeness can be determined. |
OSPL-7728 / 15854 |
Memory leakage on Waitset time-out. When monitoring shared memory using mmstat, it could be seen that v_proxy objects were leaking away every time a WaitSet timed out. Solution: The leak has been solved. |
OSPL-7737 / 15858 |
XML parser does not allow reference to DTD. The new XML parser introduced in V6.6.0p1 did not allow references or attributes that started with a '!' so that references to the DTD like <!DOCTYPE ...> would result in a validation error. Solution: Tag names and attribute names that start with '!' will no longer automatically result in validation errors. |
Report ID. | Description |
---|---|
OSPL-5834 |
Host side binaries included in target RTS installers. Host side RLM binaries (rlm, rlmutil and pristmech) were being included in target RTS installers unnecessarily. Solution: The RLM binaries rlm, rlmutil and ADLINK are no longer included in target RTS installers. |
OSPL-7082 / 14835 |
Tuner tool hangs when reading large sequences or arrays. The tuner tool can take a few minutes when reading samples that contain large sequences or arrays, and is not responsive during that time. Solution: The performance of C&M (which the Tuner tool uses) is improved considerably by using StringBuilder instead of Strings where keys and values are concatenated, and by adding happy paths to searches. |
OSPL-7641 |
Group coherence not working when used during discovery. When a group coherent writer starts publishing data before/during creation of the group coherent reader, it was possible that no group coherent data was received during the lifecycle of the reader. The transaction mechanism had an invalid (too high) writer count, because the discovery based on PublicationInfo and the EOT message both increased the count for the same writer. The count is used to determine if a transaction is complete and can be delivered to the reader; because the count was too high, the transaction never became complete and thus was never delivered to the reader. Solution: Updated the internal administration so that when a writer is discovered via PublicationInfo it is removed from a list, so that the EOT message cannot add the same writer. |
OSPL-7646 / 15812 |
Issue with register_service in Java RMI with duplicate services. The register_service call of the Java binding of RMI did not properly check for duplicate services. When a particular service-name is already registered, the call would return true without considering the instance-id and class parameters. Solution: The behaviour was fixed by returning false when a service is registered with a name that already exists, but a different instance-id or class. |
OSPL-7661 / 15813 |
Unregistering of report plugin not done sufficiently. Participant creation failed after 10 registered report plugins due to incomplete cleanup on unregistrations and a hard limit of 10 plugins. Solution: Limits on number of report plugins are removed and unregistering report plugin has been improved. |
OSPL-7673 / 15816 |
Compiler warnings caused by idlpp-generated cpp code. C++ code generated by idlpp leads to warnings when compiling the code: improvements to the idlpp templates and code generation (removing unnecessary casts, among others) could result in compiler warnings regarding the signedness of comparison operands. Solution: The signedness of length variables in generated code for sequence-of-sequences was changed to an unsigned type so the warning no longer occurs. |
Report ID. | Description |
---|---|
OSPL-7610 |
Durability crash when fellow running pre-V6.6 version is present The durability service contains a mechanism to determine compatibility with other durability services, which could be other versions with a different set of features. A flaw in this mechanism caused the durability service to crash when it received a sample-request from a durability service with a pre-V6.6.0 version. Solution: The mechanism was improved to be more robust. |
OSPL-7557 |
SOAP service not allowing connections in SP-mode Connecting any tool to a SOAP service that is running as part of a single-process DDS application fails with an error report that a participant could not be created. This is due to a change in the SOAP service where it passed on an empty domain URI internally to prevent an extra unnecessary parse of the configuration file. Solution: The SOAP service now always passes on the full URI and domain id in all cases. |
OSPL-7564 |
Change default installdir/windows start menu for Vortex V2 For Vortex_v2 the directory/start menu structure must change. All Vortex products should now install into the same structure. The version number is the version of that product and has no leading "v". For OpenSplice the structure must be: ADLINK/Vortex_v2/Device/VortexOpenSplice/<version> Solution: The new structure has been applied. |
OSPL-7565 |
Durability crash when more than 2 roles present For each name space, durability maintains information about the various roles it has merged with. The way this set is represented could cause a crash when more than 2 roles were used in the system. Solution: The representation has been fixed. |
OSPL-7616 |
Interoperability problem with RTI Connext 5.2.0 DDSI2 is quite strict in its checking of the values it receives in discovery messages, which can from time to time result in interoperability problems. With Connext 5.2.0, RTI appears to have appropriated part of the OMG-reserved namespace for a new extension in the discovery data; DDSI2 flagged it as invalid and discovery failed completely. Solution: DDSI2 now accepts unrecognised values unless the "pedantic" mode for StandardsConformance has been selected. |
OSPL-7630 |
Coherent transaction shared memory leakage For every coherent transaction a kernelModuleI::v_transactionPublisher and child objects leaked in shared memory. Solution: Memory is now freed. |
OSPL-7631 |
Possible deadlock when using coherent transaction When an end-of-transaction (EOT) message was the only message in the resend list, the 'end_coherent_changes' function could deadlock: the resending of an EOT message did not cause 'end_coherent_changes' to re-evaluate its conditions. Solution: Resending EOT messages now causes 'end_coherent_changes' to re-evaluate its conditions. |
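The EOT-resend deadlock above can be modelled with a small, self-contained sketch; this is a simplified illustration of the described mechanism (a waiter blocked until the resend list drains), not OpenSplice code, and all names are hypothetical:

```python
import threading

class ResendList:
    """Toy model of a writer's resend list watched by end_coherent_changes()."""

    def __init__(self):
        self._messages = []
        self._cond = threading.Condition()

    def add(self, msg):
        with self._cond:
            self._messages.append(msg)

    def resend(self, msg):
        # Delivery succeeded on resend: remove the message and (this is the
        # fix) notify any thread blocked in wait_until_empty(). Without the
        # notify, a list containing only an EOT message deadlocks the waiter.
        with self._cond:
            self._messages.remove(msg)
            self._cond.notify_all()

    def wait_until_empty(self, timeout=None):
        # Models end_coherent_changes() waiting for its conditions.
        with self._cond:
            return self._cond.wait_for(lambda: not self._messages, timeout)

resends = ResendList()
resends.add("EOT")
waiter = threading.Thread(target=resends.wait_until_empty)
waiter.start()
resends.resend("EOT")       # triggers re-evaluation, unblocking the waiter
waiter.join(timeout=5)
assert not waiter.is_alive()
```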
Report ID. | Description |
---|---|
OSPL-139 |
Retention-period for purging unregistered instances is too long and non-configurable Currently there is a fixed 5-second 'retention period' after which unregistered instances are actually deleted and their memory freed. The 'artefact' that this retention period prevents is the unwanted resurrection of unregistered instances in case of out-of-order reception of network traffic. The current value of 5 seconds is so long that it is not hard at all to run out of the default 10 MB shared-memory segment if you create/delete instances at a rapid pace. Solution: A new option RetentionPeriod is added to the domain configuration (OpenSplice/Domain/RetentionPeriod). This option specifies how long the administration for unregistered instances is retained in both readers and the durability service before it is definitively removed. The default value is 500 ms. |
OSPL-5395 / 13742 |
The dispose_all_data() (resp. on_all_data_disposed()) method is not supported under ISOCPP The dispose_all_data() (resp. on_all_data_disposed()) method is not supported in the ISOCPP, but must be supported in its successor ISOCPP2. Solution: The ISOCPP2 API now supports this feature. |
OSPL-6338 |
ISOCPP streams API may drop samples silently when flush of stream fails. The flush operation of the isocpp streams API will perform a write operation on the underlying datawriter. The write operation on this datawriter may return a timeout because of expiry of the reliability.max_blocking_time associated with this datawriter, which defaults to 100 ms. For example, this may occur because of network congestion. In that case the samples are silently dropped. Solution: When the flush operation on the stream fails because the underlying datawriter write operation returns a timeout, a timeout exception will be thrown. The append operation will also throw a timeout exception when the append results in a flush operation that times out. The samples will not be dropped and will be sent when the application retries the flush operation. |
OSPL-6436 / 14446 |
Errors in generated Modeller code. C++ code generated for IDL where the same module scope was repeatedly opened and closed would be ordered such that all of the code for each module would be grouped together. This could lead to the generated code being invalid due to datatypes being used before they are defined. IDL generated by the Modeller application could be sensitive to this issue. Solution: C++ code is generated with a module scope structure corresponding to the IDL source. |
OSPL-6808 |
Java5 DDS API lacks support for proprietary DomainParticipant operations. The Java 5 API was missing support for the proprietary delete_historical_data() and create_persistent_snapshot() operations on a DomainParticipant for the Java 5 DDS API. Solution: The 2 missing operations have been added to the Java5 API. |
OSPL-6943 |
1st and 3rd implementation of dds::domain::discover function potentially clash. When invoking the 1st implementation of dds::domain::discover with 2 parameters (so the time-out becomes a default parameter), the function clashes with its 3rd overloaded implementation, which also has 2 parameters and which is preferred by the compiler. Solution: The 3rd implementation of dds::domain::discover has been renamed to dds::domain::discover_all. |
OSPL-6954 / 14900 |
The concurrent handling of a terminate signal and cleanup performed by the application may cause a crash. When a termination signal is handled, which tries to detach the application entities from shared memory, while the application is concurrently performing a cleanup of the DDS entities, a crash may occur when the signal handler tries to access memory that has already been freed. Solution: A reference count is added to the objects that can be freed concurrently while the application is detaching from shared memory. |
OSPL-7035 / 15067 |
NullPointerException when using Java RMI API during shutdown. When RMI is used and a request is done during shutdown of the application it is possible that a NullPointerException can occur in the handleRequest call. Solution: The handleRequest function is adjusted so the NullPointerException will not occur anymore. |
OSPL-7095 |
Reader snapshot in Tuner returned incomplete history and provided
unreliable sample info. The reader snapshot feature from the tuner did not present reliable sample info, in particular the sample, view and instance states could not be trusted. Moreover, the snapshot only showed a single sample per instance, rather than the full history in case of KEEP_ALL or KEEP_LAST n with n > 1. Solution: These issues have been addressed. |
OSPL-7101 |
Handling of leading/trailing white-space in configuration file. In the OpenSplice XML configuration file, leading and trailing whitespace in configuration directives was not handled consistently. The raw value was stored, leaving the possibility for different services to interpret it in different ways. Solution: Whitespace is now trimmed from all configuration elements (not attributes) to get consistent behavior. Even though this doesn't strictly adhere to the XML specification, it is what most users expect. |
OSPL-7109 |
The default value of the durability StoreSleepTime is 2.0, it should be 0.0. In case data needs to be persisted, the StoreSleepTime and StoreSessionTime control the rate at which data is stored. In particular, the StoreSleepTime is intended to prevent the thread that is responsible for persisting data from eating up too much CPU time. The default used to be 2.0 seconds, which may potentially cause unnecessary delays. Since in many use cases persisting data will not cause a resourcing problem, it makes more sense to use a default of 0.0 seconds instead of 2.0 seconds. Only if it turns out that the persistency thread takes up too much CPU time should a non-zero value be used. Solution: The default value for StoreSleepTime is set to 0.0 seconds. |
OSPL-7130 / 15176 |
Problem with listener notification at startup in ISOCPP API When creating an entity with a listener and events that should be notified to the listener occur immediately, they may not when using the ISOCPP DDS API. Solution: The ISOCPP DDS API is now deprecated and has been replaced by the ISOCPPv2 DDS API that does not suffer from the problem. Users who suffer from the issue should migrate to the new ISOCPPv2 DDS API. |
OSPL-7198 / 15368 |
Enum c_collKind elements are named too generically. The elements of c_collKind are named quite generically (e.g. C_LIST). This can conflict with customers' variable naming or their 3rd-party libraries. Solution: The elements have been renamed by prefixing them with the OSPL_ prefix. |
OSPL-7202 / 15365 |
If a write with timestamp of a new sample is done after an unregister of the instance but with a timestamp before the unregister, this new sample is not received by the reader Due to the unregister, the communication path between the writer and reader(s) was destroyed, causing the write of the new sample to be discarded. Solution: In case a write with timestamp is done after an unregister but with a timestamp before the unregister, bypass the normal communication path and write directly to the reader(s). |
OSPL-7210 / 15373 |
Defining multiple protobuf messages in a single proto file fails for isocpp2 For each protobuf message in a proto file that is annotated to be used in DDS, the proto compiler back-end generates the required underlying trait when using the isocpp2 API. In case of multiple 'DDS-enabled' message structures, the generated traits end up in the same file, which is fine, but the surrounding ifdefs are the same for each trait. This leads to exclusion of all generated traits except the first, which in turn leads to an invalid argument error when trying to use the trait at run-time. Solution: In case of multiple traits, the ifdefs are only emitted once and surround the complete set of traits. |
OSPL-7228 / 15377 |
Creating Topics using ISOCPP2 with non-topic types results in cryptic error reports With the ISOCPP2 DDS API, templates and traits are used to deal with type-specific structures. This requires a pre-processing step that generates traits for data-structures that need to be published/subscribed in DDS. Whenever an attempt is made to create a Topic for a type without a trait, the code still compiles due to the fact a template exists. At run-time though, the creation fails due to the missing trait and this results in an exception being thrown (Note: the missing trait can also be caused by not including the correct header file in your application. The "<type>_DCPS.hpp" is the correct one to include). The fact that an exception is thrown is correct, but the corresponding error message is very cryptic and needs to be improved. Solution: The error message has been modified to "Topic type without traits detected. This can happen by using a non-topic type or including the wrong header file." |
OSPL-7234 / 15376 OSPL-7336 / 15432 |
The durability service may crash when not all namespaces from a fellow are received and another fellow disconnects. The role of a fellow is set when all namespaces of that fellow are received. When another fellow is disconnected, the namespace administration of all fellows is checked to see if the disconnected fellow was not an aligner. A crash occurs when the administration contains a fellow for which the role is not set. Solution: When the first namespace of a fellow is received, the role of that fellow is also set. |
OSPL-7250 |
Unable to terminate StreamDataReader applications. The get() method on a StreamDataReader can cause applications to block for a significant amount of time if a large timeout is supplied and no data is being delivered to the reader. Solution: A new method named interrupt() has been added to the StreamDataReader. Calling this method causes any threads blocking on get() to unblock immediately and return control to the application. For an example of using this method in a termination handler, please see the Streams Throughput example. |
OSPL-7256 |
Waitsets on Java API (classic and Java 5) would sometimes not unblock when the middleware is shutdown. Due to an error in handling the list of conditions, the check done to detach from a single domain might come up empty, causing the waitset to remain blocked. This was particularly likely on 64-bit platforms. Solution: The list of conditions is now passed correctly, causing the unblock to properly work. |
OSPL-7263 |
Isocpp2 can generate garbled information in exceptions. A few calls to isocpp2 would produce somewhat unclear reports and/or exceptions when they fail. Solution: The creation of exceptions and report stacks has been improved. |
OSPL-7284 / 15404 |
Unexpected and incomplete native log report. Some logs were written to the OSPL native report system in spite of the configuration SuppressDefaultLogs=True. Also, the information in the native log and the report plugin was incomplete. Solution: The tracing functionality is improved regarding the checking of SuppressDefaultLogs and the copying of traces when SuppressDefaultLogs=True. |
OSPL-7288 / 15399 |
String <NULL> given as value to report plugins When the reporting functionality fails, it can generate "<NULL>" for certain information within the report. Reports are provided to report plugins by means of XML. This will clash, because the value "<NULL>" can then be seen as an XML tag. Solution: Generate "NULL" instead of "<NULL>" in a report functionality error situation. |
OSPL-7293 |
Crash in RnR service during remove-replay command containing transformations While processing a remove-replay command containing transformations, the RnR service could potentially crash due to memory corruption. This does not occur when replay is stopped by other means, i.e. by stopping a scenario or reaching end-of-storage. Solution: The issue was caused by a double free and fixed by improving the transformation cleanup. |
OSPL-7303 |
The secure networking service cannot find the security element in the configuration file. The secure networking service reads the security settings from the Security element in the configuration file. The XPath expression used to find the Security settings is incorrect: it tries to find the Security element under the NetworkService element instead of the SNetworkService element. Solution: First try to find the Security element under the SNetworkService element. When the Security element cannot be found under the SNetworkService element, the Security element is searched for under the NetworkService element in order to be backward compatible; in that case a deprecation warning is logged. |
OSPL-7313 / 15414 |
DDSI2 ExternalNetworkAddress error in a multicast configuration. The presence of the DDSI2 ExternalNetworkAddress option in a multicast configuration was deemed an error and DDSI2 would terminate during startup. Solution: Reduce the presence of DDSI2 ExternalNetworkAddress in a multicast configuration to a warning and ignore its value. |
OSPL-7318 / 15416 |
The durability service configuration parameter maxWaitCount is parsed incorrectly. The durability service configuration parameter maxWaitCount is not correctly interpreted. This may cause the check performed by the durability service for the attachment of the networking services to a particular group to time out too early. When that occurs, the group involved is ignored. When this particular durability service has been selected as master for that group, it will not align this group to the other nodes. Solution: Calculate the correct timeout value from the configured maxWaitCount. |
OSPL-7337 / 15433 |
DDSI configuration with SSM allowed and ASM disallowed causes attempts at sending to 0.0.0.0 A DDSI configuration that enables SSM but disables ASM should (initially) send participant discovery (SPDP) packets to the explicitly configured peers only. Internally this is realised by setting the SPDP multicast address to the unspecified address (:: or 0.0.0.0), but enabling any form of multicasting caused the unspecified SPDP address to no longer be recognised as such, and hence to be considered a required destination for SPDP packets. This in turn caused the log to fill up with error messages, but was otherwise harmless. Solution: The error in the processing of the addresses has been fixed. |
OSPL-7338 |
Report 'no append' feature improvements. When using the <Report append="false"/> configuration in shared memory mode on Linux, only the traces of the last started process will be present in the log files. Also, the log files remained untouched until a process opened the log files. This means that the error log was sometimes not in sync with the info log file. Lastly, the OSPL_LOGAPPEND environment variable was not handled correctly. Solution: Delete stale log files as soon as OSPL_LOGAPPEND=false or <Report append="false"/> is detected. Only the first process (spliced or single process) is allowed to do the deletion. |
OSPL-7341 |
A durability configuration with aligner set to false may miss data on reconnect. A federation with a durability service which has the configuration option aligner set to false (alignee) didn't receive data published during a disconnect, after a reconnect, when the master durability service does not detect a master conflict. The alignee detected a master conflict and assumed the master would raise its state, while the master saw no conflict and had no reason to raise its state. Solution: The alignee now requests the latest state from the master when it recovers from a disconnect during which it had no master. |
OSPL-7368 / 15476 |
For the ISO C++ mapping the idlpp compiler generates incorrect code for a typedef of an array. When the idl specification contains a typedef of an array then the code generated by the idlpp compiler for the ISO C++ mapping is incorrect. In that case it partly generates typedefs for the array alias which are usually generated for the C++98 mapping. Solution: A condition is added to the idlpp compiler when generating code for a typedef of an array which checks if the code has to be generated for the ISO C++ or the C++98 mapping. |
OSPL-7376 / 15469 |
Idlpp generates sequence alloc functions twice when compiling different idl modules which contain a definition of a sequence of a type specified in another module. When different idl modules specify a typedef of a sequence of a type which is defined in another module, or they specify a typedef of a sequence of a basic idl type, then the corresponding alloc and allocbuf functions are generated twice by the idlpp preprocessor. For example, when two modules each specify "typedef sequence<long> LongSeq" then the corresponding DDS_sequence_DDS_long__alloc functions are generated twice. Solution: For sequence definitions the generated sequence alloc and allocbuf functions are prefixed with the scope name of the module in which they are defined. |
OSPL-7445 |
Cannot build ISOCPP custom lib on Windows On Windows the custom_lib for isocpp would fail to build due to an import/export conflict with the classic C++ API on which it was built. Solution: That conflict has now been resolved. |
OSPL-7469 |
Java5 DCPS API TopicQos.withPolicy() methods don't copy policies from original TopicQos The withPolicy() methods on the TopicQos are meant to return a copy of the original TopicQos with only the policies supplied as arguments overridden compared to the original TopicQos. The implementation returned the default TopicQos with the supplied policies overridden per supplied argument(s), but did not use the original TopicQos as source, meaning that non-overridden policies would be default instead of the value in the original TopicQos. Solution: The algorithm has been modified to use the original TopicQos as source instead of the default TopicQos. |
OSPL-7480 / 15640 |
ISOCPP C++11 should use defaulted functions. Defaulted functions were introduced in C++11, but older Visual Studio versions did not support them completely, so constructors and assignment operators are generated by idlpp for these compilers. However, the compiler detection regarding this issue was not correct, resulting in all non-VS C++11 compilers using the generated functions. Solution: Compiler detection is improved so that all C++11 compilers that support defaulted functions actually use them. |
OSPL-7527 |
Invalid handle error in isocpp2 when getting a status from an entity Requesting a status which needs an instance handle from an entity, when that status has not yet occurred, resulted in an invalid handle error in isocpp2. Invocations of the offered/requested_deadline_missed() functions on Writer and Reader in the ISOCPP2 API would crash when no such event had ever occurred before the invocation. Solution: That crash has now been fixed. |
OSPL-7528 / 15653 |
TimeoutException in ISOCPP2 should not be recorded in ospl-error.log The TimeoutException in ISOCPP2 should not be recorded in the ospl-error.log, since it might result in massive amounts of undesired log messages in normal scenarios. Solution: When a TimeoutError occurred, it would show up in the ospl-error.log file, causing potentially lots of undesired messages there, since a timeout scenario can be totally valid and is by no means an inherent error. For that reason any TimeoutError is no longer logged to the ospl-error.log file. |
OSPL-7533 |
Alignment of durability service with aligner=false configuration
by late joining master with persistent data intermittently fails When a late joining durability service with aligner=true configuration (master) joins a system with running durability services with aligner=false configuration (alignee) it could happen that the alignee nodes did not get alignment data from the master node. The alignee node detected a master conflict when the master node arrived and when it tried to resolve the conflict this intermittently failed because the master wasn't always ready to align. Solution: The alignee node now triggers a master conflict when the master is ready to align. |
TSTTOOL-164 |
Builtin topic filters do not take into account all CM* topics. If the user preference Hide Builtin (DCPS*) is set to true, the following topics would still be unfiltered: CMPublisher, CMSubscriber, CMDataWriter, CMDataReader. Solution: The filter pattern has been adjusted to filter out all DCPS* topics and all CM* topics. The preference page label has also been updated to reflect that. |
TSTTOOL-207 |
Partition combo boxes in Tester's AddReader Dialog do not always show all existing partitions If the user starts Tester with the "ospltest" command (instructing Tester to automatically connect on startup to the ospl target using JNI), it is possible for some existing partitions to be missing from the partition comboboxes in the AddReader, AddReaders and AddTopic Dialogs. This behavior is more prominent the more partitions there are. Solution: Tester's Partition Manager is now created before any of its dependent components to ensure partitions are properly managed and available right at connection time. |
TSTTOOL-332 |
Mismatch in handling of unbounded character sequences between script send and script check. In a scenario script, given a topic that has an unbounded sequence of characters in its type "cseq", a check on that field could fail: passing in indexed parameters for unbounded character sequences is accepted for send, but not for check. Solution: The check instruction can now accept indexed parameters for unbounded character sequences. |
TSTTOOL-336 |
New example Tester scripts needed to show how to manipulate the Record and Replay service. Solution: Composed some new example scripts that define record and replay scenarios, configure a storage, and then allow starting and stopping the scenarios on demand. They are now part of the suite of example scripts. |
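The RetentionPeriod option introduced under OSPL-139 above is a child of the Domain element. A minimal configuration sketch, assuming the value is given in milliseconds (the unit stated for the default) and with an illustrative domain name:

```xml
<OpenSplice>
  <Domain>
    <Name>ospl_shmem</Name>
    <!-- Retain the administration for unregistered instances for 500 ms
         (the documented default) before it is definitively removed. -->
    <RetentionPeriod>500</RetentionPeriod>
  </Domain>
</OpenSplice>
```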
Report ID. | Description |
---|---|
OSPL-7425 |
Support for armv7-marvell-linux-gnueabi-hard Port for Ubuntu 14.04 64-bit host to Ubuntu 14.04 for a custom ARM V7 board using the Marvell Armada 385 processor and the armv7-marvell-linux-gnueabi-hard_i686_64K_Dev_20131002 compiler. |
Report ID. | Description |
---|---|
OSPL-7134 / 15140 |
Idlpp (cppgen) created ambiguous enum mutators. Idlpp (cppgen) generated both a value mutator and an r-value reference mutator for enum types. These mutators conflict when compiling. Solution: Idlpp (cppgen) will not generate the r-value reference mutator for enum types anymore. |
OSPL-7155 |
Failing group coherence on reconnects. During reconnect tests we discovered failures in the discovery of matching writers causing updates never to become complete. Solution: Improved discovery mechanism which solves several use cases of failing group coherent updates during discovery phase of communication end points. |
OSPL-7160 / 15268 |
Conditions cannot handle const functors. The different kinds of isocpp2 Conditions only supported non-const functors. This particularly caused problems when using lambda functions. Solution: Added support for const functors. |
OSPL-7227 / 15378 |
Multiple definitions of copy-routines in code generated by protoc gen-dds back-end The protoc-gen-dds back-end generates code to allow .proto messages to be transparently published/subscribed. When using multiple messages originating from different .proto files in a single application, some code gets duplicated causing symbols to be defined multiple times and linking of compiled code to fail. Solution: The offending code has been moved to a separate file, which is included by all files that use the construction instead of duplicating it in each file. |
OSPL-6802 |
A new merge policy called CATCHUP is available. This merge policy is similar to the already existing REPLACE merge policy, but the resulting instance states may differ. When nodes get disconnected, their historical data sets may diverge. To recover from divergent states once the nodes get reconnected again, the durability service has defined various merge policies. One of these is the REPLACE merge policy, which disposes and replaces all historical data by the data set that is available on the remote node. Because all data is disposed first, a side effect is that instances whose state did not change will still be marked as NEW after the merge. For some use cases it is undesirable that instances which have not diverged are still marked as NEW. For that reason a new merge policy called CATCHUP is available now. Solution: The CATCHUP merge policy updates the historical data to match the historical data on the remote node by disposing those instances that are not available any more, and adding and updating all other instances. The resulting data set is the same as that for the REPLACE merge policy, but without the side effects. In particular, the instance state of instances that are present on both the local node and the remote node and for which no updates have been done will remain unchanged. |
OSPL-7106 |
When durability terminates while persisting samples, a sample can be lost. When persistency is enabled, the durability service is responsible for writing samples to the persistent store. If durability terminates while writing samples to the persistent store, it may occur that a sample that has recently been extracted from the persistent queue but not stored yet, gets lost. Solution: The check to decide whether durability should terminate can now only occur before taking a sample from the persistent queue and after it has been written to the store, but not in between. This ensures that a sample that has been taken from the persistent queue will always be persisted. |
OSPL-7093 |
When a merge policy is applied, private groups are accidentally aligned. When a merge policy is applied, merge data must be requested from the master. It turns out that also data for private groups is requested when a merge is about to take place. This is not needed, as private groups are by definition local groups, and data for these groups should never be merged. Solution: When a group is marked as a private group, no data for that group will be requested anymore. |
OSPL-7136 |
Improved handling of nested GPB messages during proto files compilation. The compilation of nested GPB messages could cause compiler crashes when handling multiple proto files and/or proto files without the package keyword. Solution: The parsing part of the GPB compiler has been improved and made more robust. |
OSPL-7153 |
When durability terminates while a merge conflict is pending to be resolved, memory may be leaked. When a merge conflict is detected, the durability service creates a conflict object and stores it in a queue. A conflict resolver asynchronously takes conflicts from the queue and tries to resolve them. In case a conflict object is pending in the queue and the durability service terminates, pending conflicts in the queue are NOT destroyed. This may lead to leakage of the conflict object. Solution: When durability terminates, any pending conflicts in the queue are cleaned up. |
OSPL-7189 / 15316 |
Networking receive thread can spin at 100% following a disconnect/reconnect The RT networking service keeps track of a list of ACKs waiting to be sent. The code to cleanup this list following a disconnect contained an error that could corrupt the list if packet loss had occurred just before the disconnect. Solution: the cleanup is now robust against packet loss. |
OSPL-3817 |
The idlpp preprocessor crashed when the IDL specification contains a sequence of a typedef of sequences. The IDL preprocessor handled a sequence of a typedef of sequences incorrectly, which resulted in a crash when it tried to parse such a construction. This was caused by a missing condition in the preprocessor. Solution: The missing condition has been added to the idlpp preprocessor so that a sequence of a typedef of sequences is handled correctly. |
OSPL-4403 |
Corrupt RnR storage when using a storage type that doesn't match
contents of existing file Two RnR storage backends are available: XML and CDR. When configuring a storage, either by publishing RnR config commands or in the OpenSplice configuration file, it's possible the data files already exist on the filesystem and contain data recorded during a previous session. When the recorded data is in XML format, it's possible to configure a new storage to use CDR format, or vice-versa, and append new samples to the storage in a different format than those recorded earlier. This corrupts the storage and it will not be usable for replay. Solution: The issue was resolved by adding verification of attributes when existing files are found. Changing a storage in a way that conflicts with existing files is no longer allowed. |
OSPL-7060 / 15088 |
When OpenSplice is stopped an application may remain blocked on
the WaitSet.wait operation. When an application is waiting in the WaitSet.wait operation and at that moment OpenSplice is stopped with the 'ospl stop' command or the spliced daemon is killed, then the WaitSet.wait operation will never return. This may occur even when the wait is performed with a timeout value. Solution: When the application is attached to the shared memory segment a callback is registered. When the spliced daemon which controls the shared memory segment is terminated the callback will be triggered. The callback detaches the application from the shared memory segment and triggers the WaitSet.wait to return OK together with the list of conditions that were detached from shared memory. When performing an operation on one of the returned conditions, ALREADY_DELETED will be returned. |
OSPL-7090 |
Publishing multiple RnR commands at once may result in some commands being rejected by the service An issue in the ordering of commands received at the same time may cause commands to be rejected if they depend on each other. For example, a config command that defines a storage must be processed before an add-record command that uses that storage; otherwise the add-record command is rejected because the storage is undefined. Solution: The issue was resolved by processing the commands in the same order as they are delivered to the service. |
OSPL-7126 / 15106 |
Crash of watchdog thread on VxWorks RTP shared memory deployments The VxWorks support for shared mutexes and condition variables does not match the POSIX pthreads behaviour required by OpenSplice core, with the OpenSplice abstraction layer translating between the two. Because of an issue in the abstraction layer, reuse of a shared memory address for a mutex could result in different processes using different VxWorks kernel mutexes. This in turn caused race conditions and, under some circumstances, crashes. Solution: the abstraction layer has been updated to not dissociate the mutex names from the kernel mutexes, thereby ensuring that all processes always agree on the kernel mutex to use. |
OSPL-7141 / 15185 |
Crash when getting IsoCpp null listener. Calling the listener getter function in IsoCpp when no listener was set causes a crash, because it tries to use a null pointer. This null pointer belongs to an internal object that is created when a listener is attached. Solution: A check has been added to the getter functions that returns null when the internal data object is null, meaning that no listener was set. |
OSPL-7142 / 15187 |
First few samples written after discovery of a remote node not
delivered Data is forwarded from the local writers (i.e., those attached to the same shared memory as the networking service in a shared memory deployment) only when the RT networking service has detected remote nodes, with some additional safeguards for data written while the system is starting up. When the first remote node was discovered, services (and applications) on the local node were informed of this event before the forwarding was enabled, which could cause data to be lost. A typical symptom is that the durability service fails to make progress. Solution: forwarding is now enabled before the existence of the remote node is announced. |
OSPL-7151 |
Error registering a non-scoped type in Java Attempting to register a type without a scope (i.e. module-less) using the Java API no longer worked, due to an internal algorithm that simply assumed that every type was scoped. Solution: The internal algorithm has been modified to check whether the type being registered has a scope. |
TSTTOOL-296 |
Sample Display view and Check script instruction incorrect when a char sequence field is followed by a field with the same initial prefix. If a topic data type contains a string field (or a character sequence or array) followed by another string field whose name starts with the name of the previous field, then the internal data model would mistakenly concatenate both fields into one. Solution: The string field concatenation involving similarly named fields has been fixed. |
TSTTOOL-314 / 15087 |
Tester bounded sequence fields always send the max length of the sequence. When Tester writes a sample containing a bounded sequence, it always populates the full sequence length with values. Solution: The tool has been modified to only allocate sequence elements for elements that are actually defined, instead of first allocating the full length with defaults. NOTE: sequence elements must be defined in index order in a script, e.g. send aTopic(seq[0] => 1, seq[1] => 1); is valid, while send aTopic(seq[1] => 1, seq[0] => 1); is not. |
Report ID. | Description |
---|---|
OSPL-6616 / 14697 |
Lack of reliability for first few packets on a channel can lead to unexpected behaviour Loss of the first few packets on a channel was not always detected and corrected. In case these packets belonged to the durability protocol for example, this could lead to unexpected behaviour. Solution: The reliable protocol is extended to support (re)transmitting packets to nodes that have not yet responded. Information about these nodes is shared across channels, so that when a single channel discovers reliable communication with a node on a NetworkPartition, all channels will. Furthermore, receiving nodes wait until they received the oldest packet they should receive before starting delivery of data. |
Report ID. | Description |
---|---|
OSPL-4153 |
Durability to support compression for KV-persistence The durability service has a KV-persistency implementation that allows persisting data to disk in either SQLite or LevelDB. Performance testing indicates that in some use cases the disk is the bottleneck when trying to achieve a high throughput. Therefore it should be possible to compress data when persisting. Solution: Samples can now be compressed by durability before persisting them (and decompressed before re-publishing them in DDS). Durability supports compression as a configurable option for KV persistence. For details on how to configure it, please check section 4.3.3.9.3 of the Deployment Guide |
OSPL-6519 |
Durability service may not detect disconnecting federation The durability service could in some situations miss a dispose of DCPSHeartbeat, causing it to think that a remote federation is still running. That in turn may prevent correct alignment of historical data. Solution: The algorithm that checks whether the DCPSHeartbeat is disposed has been corrected. |
OSPL-6950 |
DDSI2 not utilizing latency budget shorter than 1s The interface between the OpenSplice kernel and the networking services implements the latency budget, but requires the networking service to make a trade-off between idle wake-ups and efficient handling of short latency budgets. The DDSI2 service opted to have few idle wake-ups, but at the cost of essentially treating a latency budget setting < 1s as if set to 0s, which differs significantly from the RT networking service that puts the cutoff at 10ms by default (The RT networking service has useful work to perform in the idle wake-ups, hence the different trade-off). It should be noted that a high-rate writer typically manages to have some data in the network queue at all times, in which case it is not the latency budget that drives packing, but the sheer amount of data. Solution: DDSI2E now allows configuring this using the Internal/Channels/Channel[@name]/Resolution setting (similar to the RT networking setting). The default is unchanged. |
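As an illustration, the new setting could be configured along these lines. The channel name and the 10ms value are invented for the example, and the element structure surrounding the stated Internal/Channels/Channel[@name]/Resolution path is an assumption that should be checked against the Deployment Guide:

```xml
<OpenSplice>
  <DDSI2EService name="ddsi2e">
    <Internal>
      <Channels>
        <Channel name="MyChannel">
          <!-- Hypothetical example: wake up every 10ms to honour short
               latency budgets, similar to the RT networking Resolution -->
          <Resolution>10ms</Resolution>
        </Channel>
      </Channels>
    </Internal>
  </DDSI2EService>
</OpenSplice>
```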
OSPL-6984 |
DDSI2 possible use-after-free following end_coherent_updates DDSI2 may find itself forced to grow a message buffer when sending an end-of-transaction commit message, which may result in the message header being relocated in memory. This could lead to the setting of a flag in the message header in freed memory. Solution: It now recomputes the address of the message header. |
OSPL-6985 |
Crash due to refcounting issue in large group-transactions The publishing side in group transactions larger than 50 writers may crash because of a refcounting issue in enlarging the group transaction administration. Solution: The refcounting issue has been fixed. |
OSPL-7013 |
DDSI2 limits max deployments on a node that are discoverable via unicast When DDSI2 tried to allocate a participant index when Discovery/ParticipantIndex = "auto", it limited itself to indices 0 .. 9. In practice, this meant that running more than 10 (single-process) deployments on a machine required multicast discovery and the ParticipantIndex option set to "none". The amount of unicast discovery pings sent out periodically is directly related to the maximum, which argues in favour of a small limit. However, it should provide a way of configuring a higher limit in cases where this is required. Solution: A setting has been added, Discovery/MaxAutoParticipantIndex, which configures the highest index to be tried automatically. |
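A hypothetical configuration fragment showing the new setting. The value 99 is illustrative, and the element structure around the stated Discovery path is an assumption following the usual OpenSplice configuration conventions:

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Discovery>
      <ParticipantIndex>auto</ParticipantIndex>
      <!-- Illustrative: try indices 0..99, allowing up to 100
           deployments discoverable via unicast on this node -->
      <MaxAutoParticipantIndex>99</MaxAutoParticipantIndex>
    </Discovery>
  </DDSI2Service>
</OpenSplice>
```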
OSPL-7032 |
C&M API does not allow creation of Publisher with one or more non-default immutable policies The C&M API internally creates a publisher as enabled and tries to apply the QoS afterwards. If any of the immutable policies has a non-default value, this fails. This prevents tools from creating publishers with non-default values for immutable QoS settings. The implementation should create the entity disabled, set the QoS and then enable the entity. Solution: The C&M API now creates the publisher in disabled mode, sets the QoS and then enables the publisher. |
TSTTOOL-298 |
Protobuf feature for Tester does not account for key and filterable fields with name overrides. If a .proto data definition contains FieldOptions with "name" defined to override the name of the member in DDS, then Tester does not recognize it, and subsequently fails to read those fields. Solution: The feature now uses all the DDS-specific FieldOptions defined in the protobuf metadata to find the real names of the key and filterable fields, and properly translates them between the sample data view in the tool and sample reading/writing at the middleware. |
OSPL-7027 / 14930 |
100% CPU usage in network service when resending locally rejected samples When a local datareader uses resource limits and runs into these limits, the networking service may go to 100% load when attempting to re-deliver a sample that is rejected because the maximum resource limits have been reached. Solution: The networking service now reschedules the delivery of a sample to a datareader for the next resolution tick, instead of attempting the re-delivery in a busy-wait loop until it is delivered. |
OSPL-7052 |
Mismatching QoS messages in ospl log and missing remote group detection for client durability. The durability service uses various topic definitions for the implementation of the client-durability feature. The QoS values for max_blocking_time and lease_duration for these topics should be the same as the defaults in the DDS specification, but due to a bug this was not the case. When a client created a topic that did use the default QoS values, a warning would appear in the ospl log file. Furthermore, the client durability server did not detect the creation of non-volatile writers on remote nodes. Consequently, the client durability server would not collect and align the data written by these writers. Solution: The durability service has changed the QoS values for the topics related to the client-durability feature, so that the max_blocking_time and lease_duration values now conform to the DDS specification. Also, the durability service now detects the creation of non-volatile writers on remote nodes, so that the server is able to collect and align the data written by these writers. |
OSPL-7089 / 15092 |
DURATION_ZERO_SEC and DURATION_ZERO_NSEC notation not recognized
by QoSProvider. The QoSProvider does not recognize the DURATION_ZERO_SEC and DURATION_ZERO_NSEC time values defined in the XML syntax. Solution: The missing time values are now correctly parsed. |
TSTTOOL-301 |
Writing octal character codes in c_char fields in script scenarios fails validation. As of OpenSplice V6.5.0p7, a feature was introduced in Tester where values entered into sample field parameters in a scenario script send or dispose command undergo validation on execute, to ensure that the typed-in value fits the known IDL type for that topic. The validation failed to consider the case of IDL char fields where octal character codes were used as input. Solution: Script validation of topic fields of IDL type "char" now allows values of the regex form \\[0-3][0-7][0-7], e.g. a valid send instruction: "send aTopic(aCharField => '\000');" |
Report ID. | Description |
---|---|
OSPL-5473 |
Shared memory leak during alignment Implicit disposes weren't delivered for builtin topics, causing leakage in shared memory. There is a small chance other topics could potentially leak as well, but it was never observed. Solution: The implicit (internal) flag is now part of the equation to find duplicate messages. |
OSPL-6284 / 14504 |
Durability service fails to update service lease An issue in the durability service could result in a spinning thread during alignment of data from other durability nodes. The spinning thread triggers internal safety mechanisms, eventually preventing the service from updating its service lease, which causes the spliced process to consider the service crashed or deadlocked. Depending on the configured failure-action, spliced may decide to terminate or restart all OpenSplice services. Solution: The cause of the spinning thread has been resolved. |
OSPL-6519 / 14561 |
Incorrect processing of DCPSHeartbeat by spliced and durability The durability service could in some situations miss a dispose of DCPSHeartbeat, causing it to think that a remote federation is still running. That in turn may prevent correct alignment of historical data. The spliced thread responsible for processing the DCPSHeartbeat topic, and thus for monitoring the liveliness of other nodes in a domain, could get stuck reading old data. Depending on configuration (i.e. realtime thread priorities) this could result in unreasonably high CPU usage, preventing other threads from running. This means the liveliness monitoring of other nodes is not reliable, and removal of a node can possibly go unnoticed by the spliced processes on other nodes in the domain. Solution: The algorithm that checks whether the DCPSHeartbeat is disposed has been corrected in durability, and spliced now skips data that has already been processed. |
OSPL-6683 / TSTTOOL-208 |
Tooling version compatibility constraint. Vortex OpenSplice Tester and Tuner need to connect to a C&M API of at least version 6.5.0 in order to read the information they require. Solution: When using the C&M API to connect to a remote Vortex OpenSplice DDS system, a minimum version of 6.5.0 is now enforced for the connection between local and remote. |
OSPL-6992 |
Some cross development installers are not built for the correct
architecture. In some cases where the host machine is 64 bit in a cross-development environment, the installer would be generated for a 32 bit host. On a subset of machines without 32 bit compatibility libraries the installer would not run. Solution: The installer generation logic has been updated to build for the correct architecture. |
OSPL-7010 / 14926 |
Deadlock while terminating during initial alignment of durability. Durability did not notice a fellow being removed during initialization causing the initial alignment loop to be infinite. Solution: The durability service now checks if a fellow is still alive before deciding to wait for the communication state to change. |
OSPL-7015 / 14928 |
C# API sample marshalling issues The C# API had problems marshalling samples that had an attribute that was a sequence of sequences, and unmarshalling samples that had an attribute that was a sequence of strings. Solution: Both issues have now been fixed. |
TSTTOOL-261 |
As a user I would want to see the reason for failure of QoS compatibility in Tester Tester did not show reasons for failure of QoS compatibility. Solution: The reasons for incompatibility are now displayed in a tooltip when a reader is highlighted as incompatible. |
TSTTOOL-272 / 14761 |
Slow sending of large strings from Tester Tester executed non-linear code when sending strings, char sequences and char arrays. A 'send' instruction that contained a string of about 10K characters would take several minutes to complete. Larger strings took exponentially longer. Solution: The code was reworked so that send behaviour is now no worse than linear in the string size. Casual observation suggests that a send is close to constant time. Typical send times for strings up to 64K characters were observed to be in the 30-60 ms range. |
Report ID. | Description |
---|---|
OSPL-5473 |
Small memory leak in DCPS api listener dispatcher for C and C++. The list of observables monitored by the listener dispatcher leaked. Solution: Create the list object at creation instead of on the fly. |
OSPL-6867 |
Historical data that was requested by a client using the
client-durability feature could only be delivered to a single,
fixed partition instead of the partition preferred by the client. When the durability service receives a request for a client for historical data using the client-durability feature, the historical data could only be delivered to a single, preconfigured partition. Some use cases require the data to be delivered to different partitions. For this reason it is now possible to deliver the historical data to the partition requested by the client. This feature only applies to client-durability, the behaviour between durability services is not affected. Solution: Clients for client-durability can now specify the partition to receive historical data on a per-request basis. |
OSPL-6947 |
Tools and SOAP service may crash. A piece of memory may be freed too soon when deleting proxy entities, as our tools do (potentially through a SOAP connection). This leads to reads of freed memory that can cause the tools or the SOAP service to crash. Solution: The memory corruption has been solved. |
Report ID. | Description |
---|---|
OSPL-6900 / 14883 |
Leak of global references in Classic Java PSMs. A global reference was created for some Java classes related to DDS entities such as DomainParticipantImpl, TopicImpl etc. These references were never deleted, so a class instance would leak every time an entity was deleted through the corresponding DDS calls like factory.delete_participant or participant.delete_topic. Solution: The issue was resolved by deleting the global reference when an entity is deleted. |
OSPL-6896 / 14878 |
Durability crash during alignment. When a fellow disconnects while durability is aligning that fellow, durability could crash. Solution: The alignment functionality has been adjusted so that this crash can no longer happen. |
OSPL-6797 |
DDSI2 needs to support multicast on platforms that incorrectly declare multicast unsupported. DDSI2 relies on the network interface capabilities it reads from the operating system kernel to determine whether or not multicasting is supported on the selected interface, but some platforms do not mark the interface as supporting multicasts even though it does in reality. The workaround of setting the flag manually on the interface requires elevated privileges, which is not always acceptable. Solution: An option Internal/AssumeMulticastCapable is now available in DDSI2, which can be set to a comma-separated list of interface name patterns (i.e., including ? and * wildcards) that are assumed to be multicast capable. |
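A sketch of how the new option might be configured. The interface pattern eth* is invented for the example, and the element structure around the stated Internal path is an assumption following the usual configuration conventions:

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Internal>
      <!-- Treat all interfaces matching eth* as multicast capable,
           even if the kernel does not report the capability -->
      <AssumeMulticastCapable>eth*</AssumeMulticastCapable>
    </Internal>
  </DDSI2Service>
</OpenSplice>
```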
OSPL-6889 / 14875 |
Crash of spliced during termination while application(s) still running. Normally when the spliced process is terminated, application participants have already detached from the shared database so the database can be removed and OS resources reclaimed during spliced termination. However it is also possible that applications are still running when spliced is terminated. In that case the database will be detached and OS resources can only be reclaimed after the last application participant has also detached from the database. An issue could lead to a crash during spliced termination in the latter case. Because the database is detached but not removed, another spliced thread would try to access it resulting in undefined behaviour. Solution: The issue was resolved by preventing the offending spliced thread from accessing administration data during termination stage. |
OSPL-6912 / 14885 |
DDSI2 can mismatch sockets and participants in "many sockets" compatibility mode. DDSI2 can operate in a compatibility mode in which each participant gets its own socket, in which case there is no requirement to include an explicit destination address in messages intended for just one participant, but the mapping from socket to participant could be done incorrectly. That results in the wrong participant being addressed, which typically results in a message being dropped, but could also lead to data being considered acknowledged when in fact it hadn't been. The problem is only known to occur when interoperating with the TwinOaks CoreDX product. Solution: The mapping has been corrected. |
OSPL-6909 / 14886 |
Deadlock during durability exit. When durability is terminating it could end up in a deadlock when terminating its internal listeners. Solution: The listener termination mechanism within durability is adjusted and the deadlock cannot occur anymore. |
OSPL-6894 / 14876 |
The u_domain object is not freed causing a memory leak The u_domain object was not freed even though no other object had a reference to it. Solution: When the last participant is freed, and thus all references to the u_domain object are released, the u_domain object is freed. |
OSPL-6419 |
Possible crash in C++ when calling find_topic and deleting the participant in another thread When one thread blocks in find_topic on a C++ participant while another thread deletes that participant, a crash could occur because the domain was unaware of the deletion of the kernel participant object. Solution: The domain is now made aware of the deletion of the kernel participant object. |
Report ID. | Description |
---|---|
OSPL-6781 / 14838 |
Possible deadlock during durability clean up. When delete_contained_entities is called it is possible that durability could end up in a deadlock. Solution: The clean-up mechanism of durability has been adjusted so this deadlock can no longer occur. |
OSPL-6707 |
Perc JVM error when using waitsets. When using the Perc JVM in combination with waitsets an error occurs: java.lang.NoSuchMethodError: get_trigger_value. Solution: The error is fixed and will not occur anymore. |
OSPL-6555 / 14568 |
OpenSplice installer could fail when installing OpenSplice as a service. When installing OpenSplice as a service while the Microsoft Visual C++ Runtime redistributable is not installed on the system, the installation could fail. Solution: The installer has been adjusted to install the Microsoft Visual C++ Runtime redistributable before installing the service. |
OSPL-4466 / 12862 |
C# examples give warnings. When opening the OpenSplice C# example project files warnings with the text 'Load of property 'ReferencePath' failed' occur. Solution: The project files are adjusted so this warning will not occur anymore. |
OSPL-6759 / 14832 |
Typographical error in ospl traces. The durability service contained a typographical error in the traces. Solution: The typographical error is corrected. |
OSPL-6839 / 14865 |
Warning during termination of cmsoap service. When the cmsoap service is exiting, a warning message "Received termination request, will detach user-layer from domain." can occur in the info log. The cmsoap service wrongly registered 2 exit handlers. Both of them were executed when the service was requested to terminate, and eventually both did the same thing under the hood. Solution: The exit handler registration has been adjusted so that only one handler is registered for the cmsoap service, which also removes this warning message. |
OSPL-6519 / 14561 |
Incorrect processing of DCPSHeartbeat by spliced. The thread responsible for processing the DCPSHeartbeat topic and thus monitoring the liveliness of other nodes in a domain, could get stuck reading old data. Depending on configuration (i.e. realtime thread priorities) this could result in unreasonably high CPU usage preventing other threads from running. This means the liveliness monitoring of other nodes is not reliable and removal of a node can possibly go unnoticed by the spliced processes on other nodes in the domain. Solution: The issues were resolved by skipping data that has already been processed. |
Report ID. | Description |
---|---|
OSPL-6541 |
Tuner doesn't show entity-relations in the partition view The Tuner tool no longer showed entity-relations when selecting the 'partition' view. Solution: An internal algorithm to find dependent entities has been modified to support partitions again. |
OSPL-6591 / 14606 |
OSPL waits 10 seconds before exiting with a wrong report plugin When OSPL is started and a configured report plugin is missing, OSPL always waits 10 seconds before exiting. Solution: OSPL now exits directly and no longer waits 10 seconds. |
OSPL-6628 / 14751 |
SIGKILL on an application with listeners terminates OpenSplice in an unexpected way. When SIGKILL is used on an application which uses listeners OpenSplice terminates with a message that it could not properly clean up its resources. Solution: There was a problem in the listener cleanup mechanism. This is now fixed and all resources are now properly cleaned up. |
OSPL-6682 / 14763 |
The ospl tool help for return code 16 is incorrect. The ospl tool help says "not available" instead of "non existent" for code 16. Solution: Non-existent is now also reported in the ospl help for code 16. |
OSPL-6743 |
The calculation of a key hash for a key type that is 8 bytes could lead to a crash. The client-durability feature calculates an MD5 key hash for topic keys that do not fit in 16 bytes. The hash calculation for an 8-byte key type contained an error, leading to an invalid key. Access to such a key could lead to a crash. Solution: The hash calculation has been changed so that the correct hash is calculated. |
OSPL-6748 / 14821 OSPL-6760 / 14833 |
Old sample erratically rejected causing weird resending behaviour. Due to an issue with handling the case where an old sample is not stored in the reader because newer data is already available (KEEP_LAST), a writer (or service) would begin to retry delivering that sample. As long as the state of the reader/instance doesn't change, this behaviour will continue. Solution: The proper return code is now returned, causing the sample to be accepted instead of rejected. |
OSPL-6690 |
Incorrect use of the '#' character in the RTNetworking protocol for topic-/group-coherency Due to a bug in the handling of transaction markers in the RTNetworking protocol, partition names starting with a '#' could cause problems. Solution: The processing of transaction markers in the RTNetworking protocol has been fixed. Note that builds since V6.5.0p5 that use coherency with access_scope V_PRESENTATION_TOPIC will not interoperate with builds that include this fix; an upgrade must be performed on all nodes to get the bug resolved. |
Report ID. | Description |
---|---|
OSPL-6687 |
When a sample request is received from an unknown fellow the durability service can crash When a sample request is received from an unknown fellow, the response that is generated contains flawed data which could cause the durability service to crash. Solution: The response now contains valid data so that the receiving durability service does not crash anymore. |
OSPL-6685 |
DDSI2 may "hang" reading from a socket When DDSI2 was configured to use only a single unicast port, it would attempt to read two packets from the socket corresponding to the unicast port when signalled that data was available. Usually, a packet would arrive in short order, but if it didn't happen DDSI2 would appear to hang (stop processing messages, fail to terminate properly). Solution: It now only reads when data is known to be available. |
OSPL-6588 / 14609 |
ExitRequest handler interferes with Java shutdown hook The exit request handler that OpenSplice installs runs before the JVM runs the shutdown hooks. This causes a problem for an application that tries to do some clean up when the JVM shuts down due to failing DDS operations in Java with ALREADY_DELETED as result. Solution: For Java OpenSplice now does nothing for the posix signals SIGHUP, SIGINT and SIGTERM and windows signals CTRL_C_EVENT and CTRL_BREAK_EVENT before the JVM runs the shutdown hooks. This way an application can still do some clean up. After the application clean up the JVM will then trigger the OpenSplice clean up. |
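With this change, application clean-up on SIGINT/SIGTERM can be done with a standard JVM shutdown hook. A minimal sketch, where the class name and messages are illustrative and the DDS clean-up itself is only indicated by a comment:

```java
// Sketch: a JVM shutdown hook that performs application clean-up.
// With this fix, OpenSplice no longer acts on SIGINT/SIGTERM before the
// JVM runs its shutdown hooks, so DDS operations performed in the hook
// are not doomed to fail with ALREADY_DELETED.
public class ShutdownExample {
    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            // Application-level clean-up (e.g. deleting DDS entities)
            // runs here; OpenSplice performs its own clean-up afterwards.
            System.out.println("shutdown hook: cleaning up");
        }));
        System.out.println("application work done");
        // The hook also runs on a normal exit, as happens here.
    }
}
```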
OSPL-6622 / 14699 |
Possible NULL pointer dereference could cause durability service
to crash on segmentation fault Removal of a fellow assigned NULL to the request member of a chain object. This could cause the durability service to dereference a NULL pointer, causing it to crash with signal 11. Solution: Upon removal of a fellow the request member of a chain object is now copied. This fix ensures the request member is properly freed if the reference count of a chain object reaches zero. |
OSPL-6662 |
API call create_persistent_snapshot execution time When the API call create_persistent_snapshot is invoked, it always takes at least one StoreSessionTime to execute. This could lead to more or less data in the snapshot than expected. Solution: The snapshot is now executed as soon as possible and no longer waits at least one StoreSessionTime. |
TSTTOOL-180 |
Difficult to send amended topic instances in OpenSplice Tester Tester scenarios in which a topic instance is sent, and then repeatedly amended and resent, were tedious and error-prone to write, as each send instruction had to repeat all sample fields, even if only a few had changed. Solution: A new 'update parameter' (composed of the topic name followed by _update) is now accepted by the send instruction. If the update parameter is present, the sample data sent will be preserved by the topic reader. At most one sample per topic instance is preserved. If the update parameter's value is 'true', then the sample to be sent is initialized from the previously preserved sample for that topic instance, provided the topic reader has retained it. In all other cases, the sample to be sent continues to be initialized from topic defaults. If a send instruction without the update parameter sends a sample, then the topic reader clears the saved sample data for the topic instance sent. Disposing the reader clears all retained sample data for that topic. |
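A hypothetical scenario fragment illustrating the update parameter (topic and field names are invented for the example):

```
send aTopic(id => 1, a => 10, b => 20, aTopic_update => true);
send aTopic(id => 1, b => 21, aTopic_update => true);
```

Assuming the reader retained the first sample, the second send only changes b and reuses a => 10 from the preserved sample instead of falling back to the topic default.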
Report ID. | Description |
---|---|
OSPL-3824 |
C&M API cannot handle array of arrays, array of sequences,
sequence of arrays and sequence of sequences The serializer contained bugs that caused the proper elements not to be found. Solution: Serializer now uses proper name generation functions where applicable and simplified retrieval of user data elements. |
OSPL-5245 |
Write of c_char field would fail on writing certain characters to ospl log files The log contained error messages when writing certain special characters, even though the write would actually succeed for some of them. Solution: Fixed a case in the serializer where memory would be released twice, stopped the write on validation errors, and fixed the scanning for octal sequences. |
OSPL-6141 / 14430 |
Deadlock in signal handler on trapping synchronous signal in signal handler The signal handler would deadlock if a synchronous signal was caught by the signal handler thread itself. Solution: The signal handler no longer tries to gracefully handle a synchronous signal trapped by the signal handler itself. |
OSPL-6156 |
In order to retrieve historical data on a node a durability service must be configured. For nodes with limited resources that still want to acquire historical data, a full-fledged durability service may not be desirable. Until now, if a late joiner wanted to receive historical data, it had to run a full-fledged durability service. If the late-joining device runs on a platform with limited resources, running a full-fledged durability service may not be desirable. An alternative way to acquire historical data on a platform with limited resources has therefore been implemented. This feature is called client-durability. Using client-durability, a client can send a request for historical data to a durability service that implements the client-durability protocol. The server will then provide the requested data to the client. To enable the client-durability protocol in a durability service, OpenSplice/DurabilityService/ClientDurability[@enabled] must be set to true. More information is available in the deployment manual. The client-durability feature is currently an internal feature of OpenSplice and not available to applications. Solution: An alternative approach to acquire historical data has been implemented that does not require running a full-fledged durability service on the client node. |
OSPL-6626 |
Heap and SHM leakage when using the Tester in combination with the Soap service When using the Tester in combination with the Soap service, topic-related heap and SHM leakage occurred. Solution: The topic leakage has been fixed. |
OSPL-6490 |
Crash of RnR service when storage path doesn't exist The RnR service would not allow creation of a storage in a non-existing path. This caused the service to publish a storage-status update to indicate the storage is in an error state, but a bug caused the service to crash shortly after updating its state. Solution: The behaviour was changed and the service will now create the required path if it doesn't exist. If the creation fails, e.g. due to permissions or a corrupt path name, the service updates the storage-status topic accordingly without crashing. A config command can then be issued to correct the path name and make the storage accessible to record and/or replay commands. |
TSTTOOL-260 |
No reliable way to recheck the last sample of a topic instance in OpenSplice Tester Tester instructions such as check_last and check_any would not reliably find the most recent sample for a topic instance. Solution: A new instruction, 'recheck_last', was introduced. Its syntax is identical to check_last, but its behaviour is different: recheck_last will always check the most recent sample received, whereas check_last fails if that topic has already been checked by a previous instruction. |
TSTTOOL-180 |
No way to invoke a JavaScript without invoking another instruction in OpenSplice Tester The scenario syntax did not allow for the direct invocation of a JavaScript (or other script language). Instead, users needed to include the script as part of another instruction, such as the log instruction. For scenarios making heavy use of scripts, this was both inconvenient and resulted in unnecessarily large log files. Solution: Scripts are now allowed in scenarios at the same level as other instructions. The script must be enclosed in back quotes (`). The script invocation block (the script enclosed in its back quotes) must be terminated by a semi-colon. |
TSTTOOL-179 |
OpenSplice Tester provides no feedback on data conversion errors Sending a sample which included invalid data (e.g. a floating point value for an integer field) was silently ignored; no entries were made in the Tester log. Solution: Tester now checks for invalid values and creates appropriate log entries. |
TSTTOOL-194 |
OpenSplice Tester fails reading scenario files in some cases Tester would fail to start if it did not have permissions on all the folders containing the specified script and/or macro paths. Tester could also open the wrong script/macro file if a backup copy with a tilde (~) appended to the full name was present. Solution: Tester now correctly starts and opens scenario and macro files for which it has at least read access. |
Report ID. | Description |
---|---|
OSPL-5888 |
Listener scheduling QoS settings were not being honoured Listener scheduling QoS (changes) were not applied to the listener dispatcher thread. Solution: The QoS operations involved in setting and updating listener scheduling changes have been reworked and the listener dispatcher code has been made event based. |
OSPL-6156 |
For nodes with limited resources that still want to acquire historical data, a full-fledged durability service may not be desirable. Until now, if a late joiner wanted to receive historical data, it had to run a full-fledged durability service. If the late-joining device runs on a platform with limited resources, running a full-fledged durability service may not be desirable. An alternative way to acquire historical data on a platform with limited resources has therefore been implemented. This feature is called client-durability. Using client-durability, a client can send a request for historical data to a durability service that implements the client-durability protocol. The server will then provide the requested data to the client. To enable the client-durability protocol in a durability service, OpenSplice/DurabilityService/ClientDurability[@enabled] must be set to true. More information is available in the deployment manual. The client-durability feature is currently an internal feature of OpenSplice and not available to applications. Solution: An alternative approach to acquire historical data has been implemented that does not require running a full-fledged durability service on the client node. |
OSPL-6194 |
When there are many pending sample requests the durability service
becomes very slow. The durability service internally keeps a list of pending sample requests. Each time a new request is received, the durability service traverses the list to see if a pending sample request for the same partition and topic already exists. If so, the requests are combined. The algorithm used to determine whether a request was already pending was inefficient, causing high CPU load and unnecessary delays when the list of pending sample requests was very long. Solution: The check whether a request is already pending now uses an optimized algorithm. |
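The release notes do not specify the optimized algorithm; a common approach is to replace the linear scan over the pending-request list with a hash lookup keyed on (partition, topic). A minimal sketch of that idea, with hypothetical request objects that are not part of the actual durability service code:

```python
# Hypothetical sketch: combine pending sample requests per (partition, topic).
# A dict lookup replaces the O(n) scan over the pending-request list.

class PendingRequests:
    def __init__(self):
        self._by_key = {}  # (partition, topic) -> list of combined requesters

    def add(self, partition, topic, requester):
        """Register a request; combine it with an existing pending one if present."""
        key = (partition, topic)
        if key in self._by_key:                  # O(1) average-case lookup
            self._by_key[key].append(requester)  # combined with pending request
            return False                         # no new request needs to be sent
        self._by_key[key] = [requester]
        return True                              # first request for this key

pending = PendingRequests()
first = pending.add("myPartition", "MyTopic", "fellowA")
second = pending.add("myPartition", "MyTopic", "fellowB")
```

Two requests for the same partition/topic end up combined under one key, so the list never has to be traversed in full.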
OSPL-6258 |
Liveliness of a remote node not always consistent among services, built-in topics A number of independent heartbeat mechanisms were used in parallel. As a result, different parts of a single federation could have short-term inconsistencies in their views of the set of live remote nodes, and configuration dependencies between services (e.g., networking and durability) were needed to ensure the removal of a node was handled correctly. Solution: The mechanisms have now been consolidated. |
OSPL-6272 |
Networking bridge log verbosity The configurator contains a non-existent value 'finer' for the networking bridge tracing level verbosity. Solution: The value finer is removed from the configurator. |
OSPL-6320 |
Bind error when reconnecting after a network adapter goes down and then up When trying to reconnect after an ethernet adapter went from down to up, a 'bind returned errno 22 (Invalid argument)' message could occur in the ospl-error log. Solution: The reconnect mechanism has been adjusted and the error no longer occurs. |
OSPL-6332/ 14527 |
ISOCPP listeners might reference invalidated entities ISOCPP listeners used a raw pointer to reference DDS entities on the stack that might have been invalidated. Solution: ISOCPP listeners now use a dds::core::WeakReference to keep a reference to DDS entities. |
OSPL-6385 |
Durability service internal algorithm to read protocol samples
should be optimised The durability service creates datareaders to receive information from fellow durability services to align historical data. The internal algorithm to read this data took one sample at a time, whereas taking all samples at once would be more efficient processing-wise. Solution: The durability service has been modified to take all available samples at once. |
OSPL-6408 |
Invalid default max_blocking_time for DataWriter with BEST_EFFORT
reliability in ISOC++ API The ISOC++ API set the default Reliability.max_blocking_time to zero instead of 100 ms for DataWriters with BEST_EFFORT reliability. Solution: DataWriters now always have 100 ms as the default max_blocking_time, regardless of their reliability. |
OSPL-6437 |
Topic-access feature behaves unexpectedly When using the Topic-access feature, unexpected errors could occur in OpenSplice because the builtin topics were also subject to this setting. Solution: The builtin topics now ignore the topic-access setting, so OpenSplice keeps working. |
OSPL-6454 |
Installer crashes in text mode A fault in the third-party installer creation program caused the OpenSplice installer to crash when text mode was used. Solution: The installer creation program for OpenSplice was fixed. |
OSPL-6477 |
Default Secure Networking configuration files contain an error The default secure networking configuration files contained a fault which could lead to an error when loaded into the configurator. Solution: The fault is fixed and the configurator can load the file without errors. |
OSPL-6488/ 14510 |
Tuner problem when writing a bounded character array There was a fault in the Tuner when trying to write a bounded character array: when the array was written, the data was ignored. Solution: The array is no longer ignored and the data is written. |
OSPL-6491 |
Remove dependency on CORBA::string_dup in RMI Vortex Lite shares some of the RMI codebase, but does not support CORBA co-habitation. rmipp generated code that included CORBA::string_dup, which Vortex Lite does not support. Solution: rmipp was changed to generate DDS::String_dup instead, which is supported by both Lite and OpenSplice. |
OSPL-6496 |
Adding RMI support to operate without a durability service RMI service registration and discovery relies on the Transient DurabilityQosPolicy as provided by a durability service in OpenSplice. In some specific deployment conditions, that service is not available; RMI should be adapted to support that use case. Solution: A new command line option called "--RMIDurability = yes | no" was added to indicate the availability of the durability service. Please refer to the RMI documentation for more details. |
OSPL-6531/ 14566 |
DDS-RMI OSGi bundle not exporting all required packages An issue in the manifest for the OSGi bundle of DDS-RMI caused the bundle to be incompatible with application bundles that contain code generated by rmipp. Solution: The DDS_RMI.Impl packages were added to the Export-Package manifest attribute. |
OSPL-6640/ 14547 |
Problem with license files The information on how to obtain a license was not accurate for the Windows platform in the Getting Started Guide. Solution: The Getting Started Guide has been updated to reflect the proper command on Windows. |
Report ID. | Description |
---|---|
OSPL-6464/ 14412 |
Semaphore handle leak on Windows An issue in the OS abstraction layer of the product caused a leak of semaphore handles. When a condition variable, used in many areas of the product (e.g. when notifying a WaitSet or acknowledging a synchronous write), is triggered while no other threads are blocking on the condition, a shortcut is possible. However, this shortcut contained a bug and leaked a semaphore handle. Solution: The issue was resolved by making sure the handle is closed in all circumstances. |
Report ID. | Description |
---|---|
OSPL-2967 |
Deadlock in Java listeners Terminating from within a listener callback required creating a new Thread from within the callback and calling System.exit() in that thread. Solution: The additional thread is no longer required. |
OSPL-5684 |
Reference guide update for SchedulingQosPolicy
The reference guides did not mark SchedulingQosPolicy as an ADLINK proprietary property. Solution: Minor updates to guides only. |
OSPL-3936 |
DDSI2 memory leak when retransmitting part of a fragmented sample
because of retransmit queue limiting
When DDSI2 attempts to retransmit a fragmented sample, but reaches the retransmit queue limit at the first fragment, it can leak one HeartbeatFrag message. Solution: Memory leak fixed. |
OSPL-4471 |
Configurator does not support fields other than string values to
contain environment variables
The configurator did not support fields other than string values containing environment variables, e.g. ${DOMAIN_ID}. Solution: The configurator now supports environment variables for every input type except enum types and booleans. Enum/boolean environment variables are, however, supported by all services, so they can still be used by editing the configuration file manually. |
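For illustration, an environment variable in a non-string field of the configuration file might look like this. The Domain/Id element follows the standard OpenSplice deployment schema; the DOMAIN_ID variable name is the example used above:

```xml
<OpenSplice>
  <Domain>
    <Name>myDomain</Name>
    <!-- Id is an integer field; its value is taken from the environment -->
    <Id>${DOMAIN_ID}</Id>
  </Domain>
</OpenSplice>
```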
OSPL-5557 |
Starting Tester in an environment where the data types for
OsplArrayTopic and OsplSequenceTopic were already registered,
caused errors to be printed in the ospl error log.
Tester tries to register its topic types on startup. However, due to a slight change in the way the meta type is defined internally, if the topic types were registered previously, then trying to register them again caused a meta type mismatch error. Solution: The XML meta type definitions for OsplArrayTopic and OsplSequenceTopic have been updated. The error no longer appears. |
OSPL-5575 |
Begin/End Coherent Changes API calls did not return the correct
return code in case of a failure.
When using the Begin/End Coherent Changes API calls it was possible that an incorrect return code was returned. Solution: The calls now return a correct return code as per spec. |
OSPL-5739 |
When there are other users running OpenSplice, the command "ospl stop -a" may fail.
The ospl stop -a command could fail when other users were running OpenSplice. The command tries to open all key files associated with the running OpenSplice instances; when it could not open a key file because it belonged to another user, it stopped and reported an error. Solution: When the ospl stop -a command walks over the key files and finds a file it cannot open, it no longer stops but continues with the next key file. |
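The fix amounts to the usual skip-and-continue pattern when iterating over files that may belong to other users. A minimal sketch of that pattern; the function and file names are illustrative, not the actual ospl implementation:

```python
# Illustrative sketch of the skip-and-continue pattern used by 'ospl stop -a':
# an unreadable key file is recorded and skipped instead of aborting the walk.

def stop_all_domains(key_files, stop_domain):
    """Apply stop_domain to every readable key file; skip the rest."""
    skipped = []
    for key_file in key_files:
        try:
            stop_domain(key_file)
        except PermissionError:
            skipped.append(key_file)  # belongs to another user: continue
    return skipped

def fake_stop(key_file):
    # Stand-in for stopping the domain behind a key file.
    if key_file == "osp_other_user_key":
        raise PermissionError(key_file)

skipped = stop_all_domains(
    ["osp_my_key", "osp_other_user_key", "osp_my_second_key"], fake_stop)
```

The walk completes even though one key file is inaccessible; only that file is reported.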
OSPL-6030 |
Java OSGI Support
Only the standalone Java language binding had an OSGi-compliant jar file, a separate jar file called dcpssaj-osgi-bundle.jar. Solution: The following jar files are now OSGi compliant: dcpscj.jar, dcpscj5.jar, dcpssaj.jar, dcpssaj5.jar, ddsrmi.jar and rlm.jar. These OSGi jar files can still be used as 'normal' jar files; no extra separate OSGi jar files are needed. Only dcpssaj-osgi-bundle.jar is maintained for backwards compatibility. |
OSPL-6192 |
Incomplete error messages
A review of error log output across the OSPL APIs has been completed and a number of reports have been improved. |
OSPL-6285/ 14506 |
Non terminated AsyncReplyWaiter threads in case of concurrent
asynchronous methods invocations
When the application invokes multiple asynchronous requests concurrently, multiple threads called AsyncReplyWaiter are created and are not terminated by the runtime stop operation. The AsyncReplyWaiter thread is the thread that waits for asynchronous replies and dispatches them to the user's asynchronous reply handlers. There should be only one AsyncReplyWaiter thread per process, terminated at runtime stop. Solution: The AsyncReplyWaiter thread creation has been properly synchronized to avoid creating multiple AsyncReplyWaiter threads. |
OSPL-6286/ 14507 |
RMI asynchronous reply handler may be called back for an invalid reply
This could happen when the RMI server stopped running for some reason. If a client issued an asynchronous request after that, it could receive an invalid reply containing random values. Solution: The RMI asynchronous reply handling has been updated to process only valid reply topic samples. |
OSPL-6323 |
Durability can deadlock on discovering an unclean shutdown of a
remote fellow
There are various ways in which the durability service can discover that a remote fellow no longer exists, and under most circumstances will detect this by the absence of a heartbeat. There are however some other ways that are only rarely used, and one of these would result in a deadlock in durability, and in such a way that it would have significant consequences for the rest of the federation. Solution: This deadlock has been fixed. |
OSPL-6351/ 14533 |
Shared memory monitor sometimes couldn't find the specified domain name
The shared memory monitor could not properly compare domain names because of trailing whitespace. Solution: Updated the keyfile parser to trim lines before analyzing them. |
OSPL-6352/ 14532 |
Warnings when compiling C++ header files with -Wignored-qualifiers using gcc 4.8.2
When compiling code that includes the OSPL C++ header files with the -Wignored-qualifiers compiler flag, warnings showed up. Solution: The C++ include files now compile without warnings when -Wignored-qualifiers is set. |
OSPL-6371 |
Unable to use Foo type name in classic C++ PSMs
Compilation errors occur when building code generated by idlpp, if the application idl specifies a type named Foo. This type is also used internally, leading to name clashes with existing classes and methods. Solution: The issue was resolved by using fully scoped names in idlpp templates. This ensures the classes from the DDS::OpenSplice namespace are used instead of the application namespace. |
OSPL-6387/ 14539 |
Attaching conditions to a waitset during a wait could cause a crash
Due to an error in the administration of the waitset, a crash could occur if the internal administration of the waitset grew to contain over 32 conditions while waiting on the waitset. Solution: The error in the administration has been resolved. |
OSPL-6402/ 14542 |
Wrong directory path in windows PATH variable
When installing OpenSplice on Windows and choosing to let the installer set the environment variables, the installer added a non-existent directory path (%OSPL_HOME%/host/lib) to the PATH variable. Solution: The installer no longer adds the non-existent directory path. |
TSTTOOL-203 |
Tester unable to handle bounded and unbounded arrays and sequences of characters.
Solution: Tester now properly reads in, and writes to DDS, field data of bounded and unbounded arrays and sequences of characters. This is fixed in both the sample editor dialog (in the UI) and the scripting engine. |
OSPL-6431 |
An API_INFO report message is reported in the ospl-error.log but
should be reported in the ospl-info.log.
When an API_INFO report message was reported in the context of an API call, the report was mistakenly written to ospl-error.log instead of ospl-info.log. Solution: When an API_INFO message is reported, the ospl-info.log file is now selected instead of the default ospl-error.log. |
Report ID. | Description |
---|---|
OSPL-5859 / 14325 |
Durability alignment may require larger amount of shared memory than expected
During the alignment process, the durability service may collect historical data from multiple sources, and these sources may (partly) have overlapping sets of historical data. When these data sets are received, durability needs to filter out the duplicates and, once all data from all sources has been received, republish the set locally. Due to an issue in a durability algorithm, the duplicates were only filtered out at republishing time and not at reception time, causing a temporary, but unnecessary, need for a larger amount of shared memory. Solution: The flaw in the durability algorithm has been repaired; duplicate samples are now filtered out on reception. |
OSPL-5930 / 14351 |
Invalid configuration of networking service not detected at startup and causing a crash at termination.
In case the networking service is started with a missing or invalid configuration, the service would still run but a connection with other nodes is never established. At termination the service could crash on freeing internal administration related to configuration parameters. Solution: The service will now refuse to start when mandatory elements of the configuration, such as channels, are missing or invalid. An appropriate report is logged to the error log. |
OSPL-6083 |
iShapes example readers should be created with INFINITE latency-budget QoS
To allow demonstrating (the effects of changing) LATENCY_BUDGET, the readers should be created with an infinite latency-budget, as this is an RxO QoS. Solution: Readers are now created with an INFINITE latency-budget. |
OSPL-6176 |
Durability may report error if no persistent data exists on disk
The durability service may report the error 'Unable to resolve persistent data version.' during start up when there is a topic definition on disk already, but no data yet. Solution: The error is now no longer reported under these particular circumstances that are deemed valid. |
OSPL-6279 / 14503 |
All XML documents received by registered report plugins start with an error element
All XML documents received by registered report plugins started with an error element instead of using a string representation of the report type passed when generating the report. Solution: The string representation of the report type is now properly used to start and end the XML document. |
OSPL-6280 |
A non-OpenSplice reliable writer does not deliver data to an OpenSplice best-effort reader using DDSI when RnR is enabled on the OpenSplice side.
RnR was using a reader with a private QoS setting ('QoS matching relaxation') to receive both best-effort and reliable data. DDSI implementations that did not understand this setting saw the reader as a best-effort reader. When a sample was lost, the RnR reader would request a retransmit, as required for the reliable data. However, as the writer saw the reader as a best-effort reader, the retransmit request was ignored, causing the data never to be delivered. Solution: RnR will now create separate readers for each non-matching QoS setting (reliable/best-effort and shared/exclusive ownership). |
OSPL-6282 |
iShapes GUI badly formatted
The publisher buttons of the iShapes demo application are badly formatted compared to the rest of the interface. Solution: Fixed the formatting. |
OSPL-6288 |
User data QoS empty in participants discovered via DDSI2
An issue in the processing of the participant QoS caused the UserDataQoS to be empty for all remote participants discovered via DDSI discovery. This issue did not affect systems using RTnetworking. Solution: UserDataQos is now exchanged again during discovery. |
OSPL-6297 / 14516 |
Find topic timing behaviour incorrect
The find topic method on a DomainParticipant has a timeout parameter that specifies the maximum blocking time to wait for the topic to appear. In certain cases the method would block for 100 ms too long, which can have a noticeable impact on the application when called often on topics that don't yet exist. Solution: The implementation was changed to never exceed the maximum timeout. |
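The timing pitfall described here is generic: a retry loop with a fixed sleep can overshoot the caller's deadline by up to one sleep interval. A sketch of deadline-capped waiting, illustrative rather than the actual find_topic implementation:

```python
# Illustrative sketch: poll until a predicate holds, without ever sleeping
# past the caller's deadline (the last sleep is capped to the remaining time).

import time

def wait_for(predicate, timeout, poll_interval=0.1):
    """Poll predicate until it is true or the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(poll_interval, remaining))  # never overshoot the deadline

start = time.monotonic()
found = wait_for(lambda: False, timeout=0.25)
elapsed = time.monotonic() - start
```

With a fixed 0.1 s sleep, a naive loop would block for about 0.3 s; capping the final sleep keeps the wait at roughly the requested 0.25 s.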
OSPL-6295 |
Windows start menu entry needs to be moved
The location in the start menu on Windows does not match other Vortex products and needs to move to 'PrismTech/Vortex OpenSplice' Solution: Moved start menu entry to requested location. |
TSTTOOL-167 |
Tester hangs and won't close if soap connection fails
Ospl Tester hung when the soap connection failed, due to an issue in an internal algorithm. Solution: Tester no longer hangs when the soap connection fails. |
TSTTOOL-182 / 14349 |
Ospl tester appears to lock up
Ospl Tester appeared to lock up because it ran out of memory when consuming a large amount of samples. Solution: Added a new preference to limit the number of samples kept per reader, under File > Preferences > Settings > Max Samples Per Reader. The preference takes only integer values. The default value is "0", which means an infinite number of samples is kept. |
TSTTOOL-184 / 14348 |
New instance not automatically displayed in tester
In Tester, if an application data writer has the QoS autodispose_unregistered_instances set to false and unregister_instance is then called on a data writer for some instance, a sample reaches matching data readers with the no_writers state and is ignored. Solution: A new boolean option has been added in the Preferences menu under the Settings tab, called "Ignore not_alive_no_writers samples", set to true by default. If the option is set to false, these samples will be displayed in Tester. |
TSTTOOL-190 |
Tester does not show Vortex Lite ishapes instances in its
topology unless user disconnects and reconnects
Tester's Reader/Writer tables (in the Browser section) did not show all partitions for readers/writers with more than 2 partitions. Solution: Updated Reader Info and Writer Info to retrieve all the available partition names from the user data and display them properly in the Browser tab. |
TSTTOOL-191 / 14350 |
No convenient way to select an existing partition
The Ospltest tool did not provide a convenient way to select an existing partition when creating a reader. Solution: The user can now select from a list of existing partitions when creating a new reader. |
TSTTOOL-192 / 14352 |
Ospltest gives errors on startup
The script directories are not included in the RTS installers. Solution: Updated scripts and install directories to fix these errors. |
TSTTOOL-194 / 14443 |
Tester Script and Macro paths
If the Tester does not have permissions for every file/folder within and below the path specified for its script/macro area, it will not start. Solution: Tester is now updated to read files with extensions .sd, .md and .bd. |
TSTTOOL-195 / 14447 |
Ospl tester and marks being treated as failures
If Tester is receiving a very large amount of samples and a mark command is run, the execution of the command iterates through the sample list to mark all the samples it needs to ignore. However, if this iteration does not complete before the next sample comes in from the data reader, an exception can be thrown. Solution: Fixed Tester to add marks properly. |
Report ID. | Description |
---|---|
OSPL-6250 |
Java5 Protobuf example Build.bat script is not working
The Build.bat file that is delivered to compile the Java5 Protobuf example contained a copy-paste error from the one derived from Linux. Solution: The bat file has been modified to allow compilation of the example on Windows |
OSPL-6250 / OSPL-6263 / OSPL-6270 |
Java5, ISO C++ and C# applications may crash on termination
Applications using the Java5, ISO C++ or C# DDS APIs could crash at termination time due to an issue in the automatic clean-up procedure. Solution: The issue in the automatic clean-up procedure has been fixed. |
OSPL-6253 |
ISOC++ listeners may be deleted too soon
Applications using the ISO C++ DDS API may experience that entities have already been deleted during listener call backs. Solution: The entity deletion procedure now waits for potential listener callbacks to finish before deleting the entity. |
Report ID. | Description |
---|---|
OSPL-1722 |
Synchronous reliability not supported when using DDSI2
OpenSplice DDS' synchronous reliability feature was not supported using DDSI2 and resulted in timeouts on the writer side because one of the built-in endpoints needed for processing the acknowledgements could not be discovered by DDSI2. Solution: The built-in endpoint is now discovered by DDSI2 and synchronous reliability is now supported. |
OSPL-3837 |
Tuner generates error when starting with the -uri argument
When the Tuner is started with the -uri argument, an error is generated in the error log file. Solution: The error is not generated anymore. |
OSPL-4841 / 12996 |
Crash in WaitSet_wait for Java application
In some situations a Java application could crash in a WaitSet_wait. Solution: The Java API language binding has been rewritten. With this new implementation the code causing this crash is no longer present. |
OSPL-4383 |
Serializer serializes wrong double/float values when LC_NUMERIC
is not equal to the default "C" locale.
The decimal separator (LC_NUMERIC) of the "C" locale is '.'. OpenSplice assumed doubles and floats always use '.' as the decimal separator, but some locales use a different one (fr_FR and nl_NL, for example, use ','). The (de)serializing of doubles and floats was locale dependent, so a different LC_NUMERIC setting on the system caused problems. Solution: Double/float to/from string conversions have been made independent of the locale's LC_NUMERIC setting. |
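The general fix pattern can be illustrated outside of OpenSplice: serialize with a fixed '.' decimal separator regardless of the process locale, instead of with locale-dependent formatting (as C's printf/strtod do under a non-"C" LC_NUMERIC). A minimal sketch in Python, where the function names are illustrative:

```python
# Locale-independent float <-> string conversion sketch.
# Python's format() and float() always use '.' as the decimal separator,
# unlike C's printf/strtod, which honour the process-wide LC_NUMERIC.

def serialize_double(value):
    """Render a double with '.' as separator; 17 significant digits round-trip."""
    return format(value, ".17g")

def deserialize_double(text):
    """Parse a double; float() expects '.' regardless of the current locale."""
    return float(text)

wire = serialize_double(3.14)
value = deserialize_double(wire)
```

The round trip is exact because 17 significant decimal digits uniquely identify an IEEE 754 double.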
OSPL-5095/ 13095 |
No convenient RMI API for non default DDS domain
Most of the RMI features are available via the org.opensplice.DDS_RMI.DDS_Service class, but this class works on the default DDS domain only. Solution: The Java and C++ RMI APIs have been extended so that RMI applications can use any DDS domain id in a convenient way. As the CRuntime object is tied to the DDS domain used, a new method has been added to that class with the following signature: DDS_ServiceInterface getDDS_ServiceInterface() The DDS_ServiceInterface provides all the convenient methods to register/unregister services, run/shut them down, and get service proxies on the runtime-specific domain id, as the DDS_Service class does for the default domain. You can find a detailed description of that interface in the RMI API reference documentation. So, to use a non-default DDS domain for RMI invocations, the developer should first get a CRuntime object for the wanted domain id as follows (in Java): CRuntime runtime = CRuntime.getRuntime(domain_id); DDS_ServiceInterface dds_Service = runtime.getDDS_ServiceInterface(); then dds_Service can be used to perform the usual actions for client and server RMI applications. |
OSPL-5265 |
idlpp -j package prefix not applied to module-less types
idlpp package prefixes specified using the -j option were not applied to module-less types. Solution: The behaviour of idlpp changed in version 6.5 and prefixes are now properly applied. |
OSPL-5329 |
Windows installation path
On 64-bit Windows, the 64-bit version was installed in C:\Program Files (x86). Solution: The 64-bit installation directory has been changed to C:\Program Files\ |
OSPL-5424 |
Windows Registry Keys not cleared on uninstall
Registry key entries in HKEY_CURRENT_USER under Software -> PrismTech -> OpenSpliceDDS -> Version were not cleared when uninstalling. This was due to a bug in the uninstaller that prevented it completing. This could result in a number of entries in the HKEY_CURRENT_USER if multiple versions of OpenSplice were installed. Solution: The bug in the uninstaller has been fixed and uninstall now completes correctly. The Registry key in HKEY_CURRENT_USER is now cleared for any new installations but it will not clear historical entries. |
OSPL-5605 / 14105 |
Unable to attach shared memory in Java8 applications on 64-bit Windows
The OpenSplice shared-memory database needs to be mapped to the same address in all applications. On 32-bit and 64-bit Windows, the same default address was used. This caused problems with recent Java8 JVMs on 64-bit platforms, where the default address is already occupied by the JVM. Solution: The default address was changed to 0x100000000, to make the product work out-of-the-box on 64-bit Windows platforms that use a recent Java8 JVM. Note that the address can be changed through the Domain/Database/Address configuration parameter when the default is unsuitable. |
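As a sketch, overriding the mapping address is done in the deployment XML via the Domain/Database/Address element mentioned above (the domain name, database size, and the exact address value here are illustrative, not taken from this release note):

```xml
<OpenSplice>
  <Domain>
    <Name>shmem_example</Name>
    <Database>
      <Size>10485760</Size>
      <!-- Override the shared-memory mapping address when the default
           clashes with the JVM's own mappings -->
      <Address>0x140000000</Address>
    </Database>
  </Domain>
</OpenSplice>
```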
OSPL-5678 |
Networking service crash when compiled with VS2012
The networking service encountered a VS2012 C Runtime bug that caused it to crash: https://connect.microsoft.com/VisualStudio/feedback/details/782889/ Solution: strftime() is no longer called with "%Z" when compiled with VS2012. |
OSPL-5697 |
Declaring multiple RMI operations having the same name in different
interfaces leads to compilation errors in generated C++ code
Declaring two or more RMI operations with the same name in different IDL interfaces makes compilation of the generated C++ proxy classes fail because of a duplicate typedef declaration. Solution: The RMI C++ code generator has been updated to declare the typedef in the scope of the related RMI interface proxy class instead of the global scope. |
OSPL-5789 / 14287 |
Wrong error code returned by ospl
The error codes, returned by ospl, did not reflect the information within the usage help text of ospl. Also, extended error codes and a small behaviour change were required as well. Solution: The normal ospl error codes have been improved. Also, the extended error codes will be returned when using the -e option. |
OSPL-5793 / 14284 |
Error missing field initializers in the OpenSpliceRMI header file
dds_service_sdk.h
gcc compilation could warn or fail (with the -Werror=missing-field-initializers option) about missing field initializers for member "DDS::ReliabilityQosPolicy::synchronous" in the RMI header file "include/rmi/dds_service_sdk.h". Solution: The missing field initialization has been added. |
OSPL-5832 / 14298 |
Termination issues while listener callback in progress.
The listener callback implementation contained concurrency issues that were difficult to resolve due to internal design limitations. Solution: The mechanism has been redesigned, moving control of the listener thread into the PSMs. |
OSPL-5844 |
DCPS Java 5 PSM does not use correct default domain id
The internal default domainId is 0x7fffffff. When this domain id is used, the OSPL_URI environment variable is inspected and the domain id specified there is used. The Java 5 PSM always used 0 as the default domainId. As a result, setting OSPL_URI in combination with a non-zero domainId did not allow applications to create a DomainParticipant. Solution: In case no domainId is provided on creation of the DomainParticipant, OpenSplice checks the domainId in the OSPL_URI and uses the domainId specified there. This is not always 0, the default id specified by the Java5 PSM. As the default config files that we ship always use 0, this is deemed acceptable behaviour. |
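As a sketch of the resolved behaviour: with a configuration like the one below referenced by OSPL_URI, a Java 5 DomainParticipant created without an explicit id now joins domain 5 rather than 0 (the domain name and id here are illustrative, not taken from this release note):

```xml
<OpenSplice>
  <Domain>
    <Name>java5_example</Name>
    <!-- Non-zero domain id; the Java 5 PSM now picks this up from OSPL_URI -->
    <Id>5</Id>
  </Domain>
</OpenSplice>
```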
OSPL-5848 / 14321 |
The durability service can crash in combination with dispose_all functionality
The durability service could crash due to a null-pointer dereference in an internal algorithm related to the dispose_all functionality. Solution: The internal algorithm has been modified to deal with dispose all properly as well. |
OSPL-5847 / 14320 |
Unable to read sample that complies to readcondition
Due to a locking problem in the product, a read call could return NO_DATA while data was actually available. Solution: The issue has been resolved by adding a lock while freeing internal data. |
OSPL-5849 |
The Java5 DCPS implementation does not handle infinite duration
correctly in Waitset.waitForConditions calls.
The Java5 DCPS implementation internally converts durations to a different representation. In the Waitset.waitForConditions() operations the conversion algorithm was incorrect. Solution: Fixed the conversion routine to deal with infinite durations correctly. |
OSPL-5880 / 14330 |
Crash of RTSM tool
The RTSM tool fails to attach to shared memory and crashes on a segmentation violation due to a mismatch in the layout of internal data-structures. Solution: The issue was resolved and the code has been updated to prevent similar issues from occurring in the future. |
OSPL-5893 / 14333 |
The durability service could crash when running out of shared memory
The durability service did not check in all cases whether the allocation of a sample in shared memory succeeded. As a result it could dereference a null-pointer and crash. Solution: The durability service now checks whether the allocation succeeded and handles an allocation failure gracefully instead of dereferencing a null-pointer. |
OSPL-5905 |
Selecting a network interface for DDSI2 using the network address
fails on Linux
The comparison of the specified address and the network interface address failed to take the network mask into account. Solution: Code fix applied. |
OSPL-5910 / 14339 |
A valgrind memcheck analysis of certain functionality reports:
Invalid read of size 4
The copy-out function for the subscriber QoS in the gapi created a normal string for the name in the QoS. This should be a gapi object string, as it is also freed as a gapi object. Solution: The copy-out function now creates the correct gapi object for the string. |
OSPL-5913 |
Transport priority in built-in topics for remote writers always 0
when using DDSI
The DDSI2 service generates built-in topics for the remote entities it discovers, but failed to correctly set the transport priority QoS in the CMDataWriter topic, leaving it at the default value instead. Solution: DDSI2 now sets the transport priority. |
OSPL-5999 |
Tuner corrupts OSPL_URI when used with argument -uri=
When starting the Tuner with the argument -uri=$OSPL_URI, the OSPL_URI would be overwritten by the Tuner property file. Solution: The OSPL_URI is no longer overwritten. |
OSPL-6053/ 14398 |
Errors reported about lack of multicast capability by networking
service, when no multicast communication is configured
The networking service checks multicast support when a partition is added to its internal administration. Even when the network partitions are configured to use broadcast or unicast, the service still performed this check, leading to error reports when it determined that multicast is not supported. Solution: The check was moved so it is only performed after determining that the address is in fact a multicast address. |
OSPL-6069 / 6079 14409 / 14412 |
Missing RMI asynchronous replies in some cases and their memory cleanup
In the case of an OpenSplice RMI Java application that intensively invokes multiple asynchronous calls, some of the asynchronous replies could be lost. Furthermore, every asynchronous request was needlessly kept in memory, which dramatically increased the memory footprint of the RMI server. Solution: A bug in the asynchronous reply management of Java OpenSplice RMI has been fixed. The Java RMI generated code has also been updated accordingly, so existing Java RMI applications should re-generate their code using rmipp. |
OSPL-6119 |
DDSI can spontaneously change best-effort max_blocking_time
DDSI tries to minimise discovery traffic by leaving out all QoS values that are set to the default (this can be overridden using Compatibility/ExplicitlyPublishQosSetToDefault), relying on the peers to fill in the defaults again. The decision whether the reliability QoS was at its default value did not correctly account for the max_blocking_time part of it. Solution: The reliability QoS value is now set correctly. |
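For reference, the override mentioned above lives under the DDSI2 service's Compatibility settings; a minimal sketch (the service name attribute here is illustrative):

```xml
<DDSI2Service name="ddsi2">
  <Compatibility>
    <!-- Publish all QoS values during discovery, even those still at
         their defaults, instead of relying on peers to fill them in -->
    <ExplicitlyPublishQosSetToDefault>true</ExplicitlyPublishQosSetToDefault>
  </Compatibility>
</DDSI2Service>
```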
OSPL-6134 |
Documentation links in RTS don't work
Documentation links in the RTS installer don't work as the target files are not available. Solution: The RTS should not include the documentation, so it has been removed. |
OSPL-6145 |
RnR replay crash
Whenever RnR replay had to wait for the resources to become available or for the networking services to be ready to accept data for a partition/topic combination introduced by the replay, RnR could crash. Solution: The issue has been fixed. |
OSPL-6160 |
RnR replay fails when topic definitions are provided by recording
RnR does not store/inject topics automatically, instead requiring the user to record and replay the DCPSTopic topic. Replaying a topic definition means that the topic becomes available once spliced has processed the sample, but this may be too late for the writer creation in RnR. Solution: RnR now creates the topics synchronously. |
OSPL-6207 |
Report plugin robust against null pointer exception
The report plugin could crash if it experienced a null pointer exception. Solution: The plugin is now robust against the exception. |
OSPL-5524 |
Java Throughput example exception when running a 2nd publisher
Due to an issue in the internal administration of the java subscriber-part of the throughput example, the subscriber would terminate with an exception in case the publisher is stopped and started for the second time. Solution: The algorithm to update internal administration has been fixed. |
Report ID. | Description |
---|---|
OSPL-6075 / 14410 |
Concurrency issue while freeing signal-handler administration
When a process detaches from the user-layer, which occurs automatically when a process terminates, it deinits the signal handler administration. Since it is still possible the process receives a signal at the same time, the signal handler thread may still be running and depending on the administration that is freed by the exit handler. A crash or mutex deadlock is often the result. Solution: The issue has been resolved by removing the possibility of the administration being freed while still in use. |
OSPL-6184 / 14467 |
Missing event on data reader view query.
Queries on data reader views could miss a trigger causing data not to be read. Solution: The trigger mechanism is corrected and the data reader views are now always correctly updated. |
OSPL-6297 / 14516 |
Find topic timing behaviour incorrect.
The find topic method on a DomainParticipant has a timeout parameter that specifies the maximum blocking time to wait for the topic to appear. In certain cases the method would block for 100 ms too long, which can have a noticeable impact on the application when called often on topics that don't yet exist. Solution: The implementation was changed to never exceed the maximum timeout. |
OSPL-6298 / 14515 |
Crash when closing OpenSplice rtsm with ctrl+c
The RTSM tool accesses the internal database to get information and statistics. When RTSM is terminated with ctrl+c during such an access, it could corrupt the database and make the domain stop. Solution: The signal is now caught and the handler properly detaches the tool from the database before quitting when needed. |
OSPL-6350 / 14531 |
Deletion of entities while other threads are accessing causes a lot of exceptions
Deletion of entities while other threads are accessing fails as a result of a race condition between unlocking and deleting an Entity. Solution: Deletion of Entities while other threads are accessing the Entity is delayed until ongoing access has finished. |
TSTTOOL-184 / 14348 |
New instance not automatically displayed in tester
In Tester, if an application data writer has the autodispose_unregistered_instances QoS set to false and unregister_instance is then called on a data writer for some instance, a sample reaches matching data readers with the no_writers state and is ignored. Solution: A new boolean option has been added in the Preferences menu under the Settings tab, called "Ignore not_alive_no_writers samples", set to true by default. If the option is set to false, these specific samples will be displayed in Tester. |
TSTTOOL-192 / 14352 |
Ospltest gives errors on startup
The script directories were not included in the RTS installers. Solution: The scripts and install directories have been updated to fix these errors. |
Report ID. | Description |
---|---|
OSPL-6027 / 14397 |
Crash of DataWriter for multiple generations of one instance
When a DataWriter can't deliver messages because a peer has no resources to accept them, the writer will temporarily store the messages in its history and try to deliver them at a later point in time. If for one instance messages are 'delayed' for multiple generations of that instance, a crash may occur. This only occurs if a retry is able to deliver the first generation of the instance but not all messages of the newer generations. In this situation the system detects that the instance of the first generation has ended and disconnects the writer, unaware that more generations exist. As a consequence the DataWriter crashes when it tries to deliver the remaining messages. Solution: The DataWriter now actively reconnects when it successfully delivered a 'delayed' unregister message but was not able to deliver all messages of newer generations. |
Report ID. | Description |
---|---|
OSPL-5880 / 14330 |
Crash of RTSM tool
The RTSM tool fails to attach to shared memory and crashes on a segmentation violation due to a mismatch in the layout of internal data-structures. Solution: The issue was resolved and the code has been updated to prevent similar issues from occurring in the future. |
OSPL-5847 / 14320 |
Unable to read sample that complies to readcondition
Due to a locking problem in the product, a read call could return NO_DATA while data was actually available. Solution: The issue has been resolved by adding a lock while freeing internal data. |
OSPL-5936 / 14137 |
Durability Service Alignment Improvement
When a node becomes master it requests samples from all fellows. The master requests data for the groups that it knows about. Data for groups that are not known to the master is aligned later using a different and potentially slower code path, resulting in less efficient alignment. Solution: If the master should request samples from a fellow but has not yet received all groups from this fellow, it will first request the groups of the fellow in order to know as many groups as possible before requesting samples. This results in more efficient alignment. |
OSPL-6036 |
Incorrect behaviour of shared DataReaders
In case multiple shared datareaders are created (by setting the share QoS policy), in certain situations the internal administration could be freed while another shared reader still depended on it. This could lead to undefined behaviour such as a crash, or even reading data of a different topic if the internal administration was reused for other readers on a busy system. Solution: The issue was resolved by fixing a refcounting bug. |
Report ID. | Description |
---|---|
OSPL-5602 / OSPL-5676/ 14112 / 14137 |
Alignment of historical data intermittently fails in case multiple
master conflicts simultaneously appear.
When multiple durability services are started but communication between them is disabled, they all operate in isolation. When communication between them is suddenly enabled, multiple master conflicts appear. In an attempt to solve these conflicts, alignment of the data that was published takes place. When the volume of data published in isolation is large, the alignment can become massive. In some cases the alignment was flawed, leading to an incorrect end-state where different nodes have different views on the data that was published. Solution: The administration that keeps track of alignment data has been changed so that data of the same partition/topic from different durability services is not mixed anymore. Furthermore, the alignment procedure has been adapted, leading to a more efficient alignment scheme involving less alignment data. |
OSPL-5817/ 14293 |
RTNetworking service may crash on an interface status change
Due to a lock being released twice on the detection of a change in the status of a monitored network interface, the RTNetworking service could crash. Solution: The relevant locking has been revised and the double-unlock has been removed. |
OSPL-5818/ 14292 |
Starting the ospl daemon from different user accounts could result
in a delay of 10 seconds.
When two different users are working on the same node and both are able to attach to the same domain, a user could experience a delay of 10 seconds when the domain was started. The process monitor was also not working correctly in this use case. This was caused by connecting to the same domain but with a differently named communication socket. Solution: The name of the communication socket used by the process monitor is now consistent with the name of the key file in the tmp directory. |
OSPL-5827/ 14297 |
Regular reports leading to a large info log
Some info reports that don't convey any particularly interesting information are logged on application start. In a situation where applications are frequently (re)started, this could quickly lead to a large info log file. Solution: The info reports about ignored signals and initialization of the user-clock module have been removed. |
OSPL-5831/ 14296 |
Unable to launch OpenSplice Tuner after installing RTS
The script to launch the Tuner depends on another script, which wasn't included in the RTS but only in HDE installers. Trying to start the tuner triggers an error 'ospljre: command not found'. Solution: The RTS installer now includes the ospljre script so the tuner can be launched. |
OSPL-5835/ 14299 |
Crash of durability service during termination
In specific circumstances, when durability is terminated while it is resolving the master of a namespace, the service could fail on a mutex that was already freed; it would still try to unlock it, resulting in undefined behaviour and a potential crash of the service. Solution: The termination mechanism was revised to free internal administration in the correct order and only unlock the mutex when the lock was successful. |
OSPL-5851/ 14295 |
Error log file is created if builtin topics are disabled
When builtin topics are disabled an error log is created with the error: DataReader (name="DCPSParticipantReader") not created: Could not locate topic with name "DCPSParticipant" Solution: The DCPSParticipantReader is not created anymore if the builtin topics are disabled. |
OSPL-5854/ 14323 |
Lack of reporting when incompatible meta-descriptor is registered.
When type-support is registered by an application for a type that's already known, the declaration needs to match the existing declaration. When this is not the case, the registration fails and an indescriptive error is logged by the serializer. Solution: A report was added that refers to a declaration mismatch and also mentions the incompatible part, so the user can find and correct the corresponding IDL declaration. |
Report ID. | Description |
---|---|
OSPL-5673/ 14138 |
On VxWorks RTP large delays may occur, even for threads running
at the highest priority.
Priority inversion is the phenomenon where a high-priority thread runs into a lock that is taken by a low-priority thread; the high-priority thread cannot proceed until the lock is freed by the low-priority thread. The remedy for priority inversion is priority inheritance, which temporarily increases the priority of the low-priority thread until the lock is freed, thus allowing the high-priority thread to proceed. Although VxWorks provides native support for priority inheritance, OpenSplice did not benefit from it. This could cause large delays, even in threads running at the highest priority. Solution: Priority inheritance for mutexes (which are used to implement locks) is now supported in OpenSplice for VxWorks RTP. It can be enabled by setting OpenSplice/Domain/PriorityInheritance in the configuration. Note that priority inheritance for condition variables is not supported. |
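A minimal sketch of enabling the option named above in the deployment file (the domain name is illustrative, and the enabled-attribute form is assumed from the usual OpenSplice configuration convention):

```xml
<OpenSplice>
  <Domain>
    <Name>vxworks_rtp_example</Name>
    <!-- Enable priority inheritance for OpenSplice mutexes on VxWorks RTP -->
    <PriorityInheritance enabled="true"/>
  </Domain>
</OpenSplice>
```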
Report ID. | Description |
---|---|
OSPL-5219/ 13337 |
When the system is terminated while types are being registered a
crash can occur.
When a system is started, types that have been specified in the IDL specification are registered. If the system is terminated during the registration of these types, a crash may occur. The crash is caused by the registration process using references to type definitions that may already have been deleted by the termination thread. Solution: References to type definitions are now properly protected, so that it is no longer possible to delete references to types that are still in use by another thread. |
OSPL-5742/ 14164 |
Stale information in ospl artifact file causing ospl to exceed
ServiceTerminatePeriod
When the ospl tool is used in blocking mode (-f option) the artifact file is not properly (un)locked and updated under all conditions. Stale administration data could cause the ospl tool to exceed the ServiceTerminatePeriod when stopping a domain, or to report an incorrect warning when starting the domain. Solution: The issue was resolved by making sure the artifact file is properly managed in blocking mode. |
OSPL-5770 |
RnR may cause a warning by spliced about resources on termination
The RnR service didn't properly clean up one of the writers it uses. This caused the safety mechanism of spliced to kick in after RnR had terminated. Solution: The RnR service now properly cleans up the writer. |
OSPL-5772 |
NetworkingBridge may cause a warning by spliced about resources
on termination
The NetworkingBridge didn't properly inform spliced that it had terminated. This caused the safety mechanism of spliced to kick in. Because the NetworkingBridge actually did clean up its resources, spliced could always successfully clean up after the NetworkingBridge. Solution: The NetworkingBridge now properly informs spliced, so that the clean-up routines and the superfluous report don't occur anymore. |
Report ID. | Description |
---|---|
OSPL-5083 |
CMParticipant built-in topic extended with federation and vendor ids
The existing CMParticipant built-in topic needs to be extended with federation and vendor ids. Solution: The content of the CMParticipant built-in topic has been extended to include a string that may be used as a federation identifier. For Vortex Cafe, each process is considered a federation. For other vendors' products, the federation id is based on our current understanding of the identifiers used by them, and this may change as our understanding grows. Also included is the vendor id code assigned by the OMG to the various vendors for use in the DDSI protocol, thus allowing tooling to show the vendor or use vendor-specific knowledge. The vendor code consists of two unsigned integers separated by a decimal point. (The vendor code for OpenSplice Enterprise is "1.2".) |
OSPL-5357 |
Ignoring all topics in __BUILT-IN PARTITION__ in DDSI2E breaks all communication
DDSI2E internally relies on a topic in the built-in partition, but failed to note the presence of this topic/partition when ignored. While it is possible to ignore just this topic/partition, in practice, it is most likely to happen when ignoring all topics in this partition. A work-around is to configure topics C* and D* in this partition, as this does not include this particular topic. Solution: The detection of the presence of this topic/partition has been updated. |
OSPL-5430 / 13801 |
Reference to OMG ISO C++ specification missing in documentation
There is no reference in the ISO C++ PSM documentation to the OMG ISO C++ PSM specification. Solution: A link to the OMG spec has been added. |
OSPL-5688 |
Globally unique systemId needs to be generated with more care.
Each federation generates its own id at start-up, which must be unique in the system. Sometimes ids could turn out to be the same, causing undefined behaviour. Solution: Unique system id generation has been improved to prevent duplicates when two copies of OpenSplice are started simultaneously on the same Linux or Windows node. |
OSPL-5694/ 14141 |
Insufficient checking in java native marshalling routines
When a sample with uninitialized members of type union or enum is written in the Java PSM, the JVM may crash instead of the application receiving a proper error return code. Note that members are always initialized by default in the code generated by idlpp, but it is possible for the application to assign null to a member after initialization. Solution: The marshalling routines were changed to return a BAD_PARAMETER code when an uninitialized member is processed. |
OSPL-5740 / 14163 |
An incorrect error is logged when a library fails to load
An incorrect error is logged when a library, such as a report-plugin, fails to load. Libraries (e.g. the report plugin) can be entered in the configuration file in a platform-agnostic manner. OpenSplice will translate the name and, when the library fails to load, runs a fall-back mechanism to load the original name. In this process, details on the failure were lost. Solution: The product has been changed to record a proper error message in the OpenSplice error log when a library fails to load. |
OSPL-5743 / 14165 |
The durability service could crash in case a namespace to a
fellow is added for which no aligner exists.
When a durability service receives a namespace for a fellow, it adds the namespace for this fellow to the internal administration. Part of this administration is the merge state of the namespace. When no aligner for the namespace is known, the merge state is NULL. Due to a bug, setting a NULL value for the merge state would lead to a crash. Solution: The code that deals with setting merge states of namespaces has been changed so that no crash occurs anymore. |
OSPL-5744 |
Classic Java PSM QosProvider get_participant_qos() may crash
When using the Java QosProvider in combination with get_participant_qos with a non-null id, the JVM could crash. Solution: The problem in get_participant_qos is now fixed and the JVM will no longer crash on a non-null id. |
OSPL-5745 |
autopurge_disposed_samples_delay zero is not instantaneous
When autopurge_disposed_samples_delay is zero, the purge was not instantaneous: disposed data would only be purged after the monotonic clock had progressed at least one tick. Solution: This is solved by changing a timing check in the purge handling from 'larger than' to 'equal or larger than'. |
Report ID. | Description |
---|---|
OSPL-5616 |
Added support for shared library builds on VxWorks RTP
Shared library support is required for VxWorks RTP use on Pentium4 and E500V2. Solution: Added shared library support for VxWorks RTP. Due to symbol table restrictions with the GNU toolchain, the ddskernel library has been split into ddskernel and ddskernel2 for PPC shared libraries. |
Report ID. | Description |
---|---|
OSPL-1167/ 10824 |
The ospl tool has wrong exit and status codes
Regarding the status of a domain, the ospl tool returned wrong exit codes and depicted wrong status codes when listing domains. Solution: The ospl tool has been extended with a status file that contains the states of the available domains. |
OSPL-5374/ 13672 |
Issues with RT Networking CPU usage when Record and Replay service
is enabled.
An issue in the Record and Replay service could, in certain circumstances, result in native networking using up all of the CPU resources. When a recording is stopped, Record and Replay stops reading samples matching the record interest expressions, but networking continues to deliver these samples until storage resources are exhausted. When exhausted, networking anticipates resources becoming available again and continues attempted delivery at an increased rate, resulting in CPU exhaustion. Solution: A bug was found and resolved so that record interest is properly disposed of by Record and Replay, after which networking stops delivering samples that are never read by Record and Replay. |
OSPL-5615/ 14114 |
The networking service may crash when topics with a name exceeding
64 bits are used.
The networking service uses an internal buffer to store the topic names associated with received messages. Initially this buffer is 64 bits wide. When a topic name larger than 64 bits is received, the buffer should be increased in size accordingly. However, it could occur that not enough memory was allocated, causing memory corruption. Solution: The issue is fixed by always allocating enough buffer space when a topic name is received that is larger than the current buffer size. |
OSPL-5387/ 13669 |
Non-default presentation QoS incorrectly refused by product.
The middleware did not accept a publisher or subscriber QoS on which the presentation was set to instance scope with ordered_access enabled. This caused inter-operability issues with other DDS vendors, while in fact the implementation does by default support ordered access on instance level. Solution: The restriction has been lifted so that publishers and subscribers can now be created with enabled ordered_access setting, as long as the scope is set to instance. |
OSPL-5482/ 14005 |
Crash of ddsi service when a DataReader with SubscriptionKey QoS
policy is used.
Management of builtin topics by the DDSI service contained a bug that could potentially crash the service when a builtin-topic sample is created for a DataReader that has the (OpenSplice-specific) subscription key QoS policy. Solution: The implementation was fixed to correctly handle the subscription key policy. |
OSPL-5552 |
OSPL_HOME may not be set correctly when using an archived build
Builds delivered in an archived format would still contain the installer macros to be expanded at install time; without the installer these macros would remain and cause release.com to set an invalid OSPL_HOME. Solution: release.com now attempts to set OSPL_HOME using Bash when not using an installer. For users without Bash, a message will be presented asking them to manually adapt release.com with a valid OSPL_HOME. |
OSPL-5627/ 14118 |
Possible crash after exception handler has cleaned up resources.
There was an issue with the exception handler cleaning up resources that were still in use by the lease and resend managers. Solution: Before the exception handler frees the used resources, the lease manager and resend manager are first stopped. |
OSPL-5657/ 14126 |
Bounds checking error on IDL sequences with #pragma stac
When #pragma stac is applied to all members of a struct, it would also be applied to sequences in case the sequence contains strings (or a type that resolves to string). However the code generated by idlpp would not correctly handle this, leading to errors when a sample is published and bounds checking is enabled. Solution: There is no real performance benefit in applying stac transformation to sequence elements so the pragma is now ignored for struct members of type sequence. |
OSPL-5660/ 14129 |
The durability service could crash when the service is terminating
When the durability service is terminating it tries to clean up its resources. One of these resources is the fellow administration. When a fellow is being removed from the administration because it failed to update its lease in time while durability is terminating at the same time, it is possible that durability tries to reference a fellow that has already been freed, leading to a crash. Solution: References to fellows are properly counted, and a fellow is only freed when no other threads hold a reference to the fellow object. |
OSPL-5675/ 14139 |
Open dynamic loaded libraries with RTLD_NOW.
Dynamic loaded libraries were opened with RTLD_LAZY, which caused problems with external library loading (such as report libraries). Solution: Dynamic loaded libraries are now opened with RTLD_NOW. |
OSPL-5693 |
Durability servers that operate in different roles and detect
conflicting states for a namespace might handle the conflict wrongly,
possibly resulting in a crash of durability.
When different durability servers take responsibility for the same namespace but for different roles, merge policies can be applied to resolve these conflicts. Due to an error in the condition to resolve the conflict it is possible that a different merge policy is applied than specified. Also, when the correct merge policy is applied a crash could occur due to an attempt to access an object that has already been freed. Solution: The condition to resolve the conflict has been changed so that the correct merge policies are triggered. Also, the crash has been prevented by properly refcounting the object. |
Report ID. | Description |
---|---|
OSPL-4944 |
ISO C++ documentation improved
Solution: Resolved problems with the API descriptions, added code samples, and added new API descriptions. |
OSPL-5194 |
VxWorks RTP version now has descriptive thread names
Solution: Added descriptive names to the OpenSplice threads on the VxWorks RTP builds. |
OSPL-5338/ 13592 |
Writing samples from the DCPS Java API can result in an overflow
of the internal references table of the JVM.
During a write, a lot of Java object references can be created, depending on the type of a topic. Though the JNI specification only guarantees capacity for 16 local references, in practice there were never any issues on the Oracle JVM with using more references. Therefore the product did not explicitly delete references, in favor of performance during the write call. The PERC JVM however does not allow this relaxation of the JNI spec; overflowing the references table results in memory corruption. Solution: The Java PSM (JNI layer) was changed to free unused references so the table cannot overflow. |
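For readers unfamiliar with JNI local references, the failure mode can be illustrated with a toy reference table; the capacity and names below are illustrative, not taken from the PERC JVM:

```python
class LocalRefTable:
    """Toy model of a JNI local reference table with a hard capacity."""
    def __init__(self, capacity=16):
        self.capacity = capacity
        self.refs = set()
        self._counter = 0

    def new_ref(self, obj):
        if len(self.refs) >= self.capacity:
            raise OverflowError("local reference table overflow")
        self._counter += 1
        self.refs.add(self._counter)
        return self._counter

    def delete_ref(self, ref):
        self.refs.discard(ref)

table = LocalRefTable()
# Copying a sample with many fields: delete each reference once it has been
# copied (the fix), so the table never overflows regardless of the topic type.
for field in range(100):
    ref = table.new_ref(field)
    table.delete_ref(ref)
assert len(table.refs) == 0
```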
OSPL-5615/ 14114 |
The networking service may crash when topics with a name exceeding
64 bits are used.
The networking service uses an internal buffer to store the topic names associated with received messages. Initially this buffer is 64 bits wide. When a topic name larger than 64 bits is received, the buffer should be increased in size accordingly. However it could occur that not enough memory was allocated, causing memory corruption. Solution: The issue is fixed by always allocating enough buffer space when a topic name is received that is larger than the current buffer size. |
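The general pattern behind such a fix is to grow the buffer before copying, never after. A sketch (illustrative only, not the networking service's code):

```python
def ensure_capacity(buf: bytearray, needed: int) -> bytearray:
    """Return a buffer of at least `needed` bytes, growing geometrically."""
    if len(buf) >= needed:
        return buf
    size = max(len(buf), 1)
    while size < needed:
        size *= 2                      # always end up with enough room
    grown = bytearray(size)
    grown[:len(buf)] = buf             # preserve existing contents
    return grown

buf = bytearray(64)                            # initial topic-name buffer
name = b"a_topic_name_well_beyond_the_initial_buffer_" + b"x" * 64
buf = ensure_capacity(buf, len(name))          # size first ...
buf[:len(name)] = name                         # ... then copy: no corruption
assert len(buf) >= len(name)
```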
Report ID. | Description |
---|---|
OSPL-5220/ 13333 |
Unfair claim of ownership by unregister message
Unregister messages could claim ownership of an instance and, in combination with the deadline QoS and liveliness lost, this caused data reception 'gaps' when another writer with lower strength had already taken over. Solution: Unregister messages no longer claim ownership. |
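The arbitration rule behind this fix can be sketched in a few lines. The names and message shape are hypothetical; the actual OpenSplice kernel code differs:

```python
def select_owner(current_owner, writer, strength, is_unregister):
    """Exclusive-ownership arbitration: data messages compete on strength,
    unregister messages may release ownership but never claim it."""
    if is_unregister:
        if current_owner and current_owner[0] == writer:
            return None                       # the owner left: release
        return current_owner                  # never claim via unregister
    if current_owner is None or strength > current_owner[1]:
        return (writer, strength)             # stronger writer takes over
    return current_owner

owner = None
owner = select_owner(owner, "W1", 10, False)  # W1 (strength 10) writes
owner = select_owner(owner, "W2", 5, False)   # weaker writer: ignored
assert owner == ("W1", 10)
owner = select_owner(owner, "W2", 5, True)    # unregister must not steal
assert owner == ("W1", 10)                    # W1 keeps ownership: no 'gap'
```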
OSPL-5228/ 13336 |
When the durability service terminates there is a possibility that
the durability service crashes
When the durability service terminates it will clean up all its resources. While doing so there is a possibility that the action queue is destroyed while another thread still tries to access it. This situation could lead to a crash. Solution: Before cleaning up, most threads are stopped. This prevents a thread from accessing memory that has already been freed. Also, the cleanup order has been changed so that the action queue is destroyed AFTER all threads that may use it have stopped. Finally, initialization and deinitialization of objects in the durability service have been improved. |
OSPL-5361/ 13599 |
Difficulty determining if a Record and Replay service is finished
replaying samples
In case all samples in a storage were replayed, the Record and Replay service would continue to poll the storage in case new samples were recorded, in order to replay them. This meant the storage remained open and this 'polling state' was not discernible by monitoring the (storage) status topic. Solution: The behavior was changed to only enter the polling state in case a storage is used for recording as well, at the time the last sample is replayed. In case a storage is not used for recording, all replay-interest is removed, the storage is closed and a corresponding storage-status sample is published. |
OSPL-5419/ 13594 |
Liveliness detection and synchronization problem.
When disconnecting a node, the liveliness changed status is not always triggered on the remaining node when using exclusive ownership. When the liveliness changed status is triggered, the instance state is (often) still 'alive' when 'not alive' is expected. Solution: Messages of low-strength writers in an exclusive ownership setup are not handled; by no longer ignoring the unregister message of such a low-strength writer, the liveliness count is properly decreased. Also, the liveliness changed status is now triggered after the related instance states have been set to 'not alive'. |
OSPL-5459/ 13924 |
With a (default) umask of 0022 different users on the same node
interact with each other on a posix system.
When two different users are working on the same node with a umask setting of 0022, the splice daemon will attach to the same shared memory segment. This is caused by the key file, which holds the key of the shared memory segment to attach to, being created with permissions 666. Solution: If the umask gives only read or write rights to user/group/others, then the key file gets no rights at all for that class. This results in a key file with permissions 600 under the default umask of 0022. |
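The permission rule can be expressed compactly. A sketch of the described behaviour (the function name is made up; the real logic lives in OpenSplice's OS abstraction layer):

```python
def key_file_mode(umask_value: int) -> int:
    """Start from 0666, apply the umask, then drop read/write entirely for
    any class (user/group/others) that lost either of the two bits."""
    masked = 0o666 & ~umask_value
    mode = 0
    for shift in (6, 3, 0):                # user, group, others
        bits = (masked >> shift) & 0o6
        if bits == 0o6:                    # keep only if both r and w survive
            mode |= bits << shift
    return mode

assert key_file_mode(0o022) == 0o600       # default umask: private key file
assert key_file_mode(0o000) == 0o666       # fully permissive umask: shared
assert key_file_mode(0o027) == 0o600       # group lost write: group gets none
```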
OSPL-5468/ 13933 |
Customer code application build problem with 6.4.2
netdb.h system header file clashed with a symbol when building customer code. Solution: Avoided an issue with conflicting symbols by not including the netdb.h system header file when building customer code. |
Report ID. | Description |
---|---|
OSPL-5365/ 12971 |
Fixed startup failure with DDSI if configured to run using a single UDP unicast port
DDSI would not start when configured to use a single UDP unicast port. Solution: Behaviour fixed. |
OSPL-5413 |
When using DDSI, the "dispose all" command was transmitted
best-effort
The QoS used for publishing a "dispose all" command throughout the domain caused it to be sent best-effort when using DDSI, creating the possibility of it reaching only a subset of the nodes. The different design of RT networking ensured that systems based on RT networking did not run this risk. Solution: The QoS has been changed to reliable. |
OSPL-5416 |
The DCPSHeartbeat writers should be best-effort
For the OpenSplice-specific DCPSHeartbeat built-in topic, best-effort reliability suffices. In OSPL V6.4.1 it was changed to a reliable writer, which slightly affects behaviour when used with the RT networking protocol in cases where the network is overloaded or very unreliable, as delivery of the DCPSHeartbeat may be blocked by preceding messages. Solution: The DCPSHeartbeat writer QoS is once again best-effort. |
OSPL-5450 |
Deserialiser can incorrectly reject valid input because of an erroneous bounds check
An issue in the CDR deserialiser can cause a valid input to be rejected by a sequence bounds check. The affected components are DDSI, durability with a KV persistent store, RnR with binary storage, and RT networking when using compression (not the legacy compression). Solution: The check has been fixed. |
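The nature of such a bounds check can be illustrated: a CDR sequence length must be validated against the bytes actually remaining in the input, rejecting only genuinely impossible lengths. A simplified sketch (little-endian, fixed-size elements; not the OpenSplice deserialiser itself):

```python
import struct

def read_sequence_length(buf: bytes, offset: int, elem_size: int) -> int:
    """Read a 32-bit CDR sequence length and validate it against the
    remaining input."""
    (length,) = struct.unpack_from("<I", buf, offset)
    remaining = len(buf) - (offset + 4)
    if length * elem_size > remaining:
        raise ValueError("sequence length exceeds remaining buffer")
    return length

valid = struct.pack("<I", 3) + bytes(12)   # 3 elements of 4 bytes each
assert read_sequence_length(valid, 0, 4) == 3

truncated = struct.pack("<I", 1000) + bytes(12)
try:
    read_sequence_length(truncated, 0, 4)
    assert False, "should have been rejected"
except ValueError:
    pass
```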
OSPL-5460 |
Potential misaligned access in the CDR deserialiser for 64-bit objects
The CDR deserialiser could access 64-bit objects without ensuring the access is properly aligned. On most platforms, and most notably on the x86 and x64 platforms, such misaligned accesses are entirely legal, but on some platforms, they cause a misaligned access exception. Solution: The CDR deserialiser now avoids misaligned accesses. |
Report ID. | Description |
---|---|
OSPL-3957 |
OpenSplice should support hibernation
Modern platforms support the concept of hibernation and resuming. When hibernating, all processes are suspended and the complete RAM is written to permanent storage; during resuming that information is written back into RAM so that all processes continue where they left off before hibernation. However, while a system is hibernated time elapses, and as a result software may face time jumps when resuming. OpenSplice was not able to cope with these time jumps, resulting in the potential termination of services or even a complete shut-down of the middleware. OpenSplice needs to be able to cope with hibernation to allow the product to be used in environments that rely on that functionality. Solution: The various notions of time have been updated throughout the entire product, allowing it to cope with time jumps as well as resuming after hibernate/suspend. |
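The usual remedy for hibernation time jumps is to base internal deadlines (leases, timeouts) on a monotonic clock rather than wall-clock time. A sketch of the idea (not OpenSplice's actual time abstraction):

```python
import time

class Lease:
    """Track expiry on the monotonic clock, which is immune to wall-clock
    jumps caused by hibernation, resume or NTP adjustments."""
    def __init__(self, duration_s: float):
        self.duration = duration_s
        self.renewed_at = time.monotonic()

    def renew(self):
        self.renewed_at = time.monotonic()

    def expired(self) -> bool:
        return time.monotonic() - self.renewed_at > self.duration

lease = Lease(10.0)
lease.renew()
# A resume after hibernation may move time.time() forward by hours, but
# time.monotonic() is unaffected, so a freshly renewed lease stays valid:
assert not lease.expired()
```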
OSPL-4196 |
Timestamps on WinCE aren't guaranteed to be represented in UTC
OpenSplice internally uses the WinCE GetLocalTime() operation, which returns time in local time-zone. Depending on time-settings of the operating system, this time may not match UTC, which is used on other platforms. Solution: The implementation now ensures the time is represented in UTC on WinCE as well. |
OSPL-4325 |
Error log about failure to remove DBF file on Windows after domain shutdown
An issue with termination of OpenSplice services could result in a failure to remove the database (.DBF) file and corresponding message logged to the ospl-error logfile. Solution: The termination issue was resolved so that the database-file can be removed. |
OSPL-4513/ OSPL-5285/ 12875/13407 |
Shared Memory not detecting terminated or killed processes
Terminated or killed processes on Windows were not detected, which could lead to corrupt shared memory and left the liveliness state of their writers un-updated. Solution: Updated the implementation of the shared memory monitor for Windows, using specific Windows API calls. The termination of a process is now detected by the Splice daemon and proper action is taken to clean up after the termination. This might lead to a shutdown of Splice in case the terminated process was modifying shared memory when terminated, as shared memory is corrupted in that case. |
OSPL-4590/ 12947 |
When durability terminates, the durability service should try to
persist as much data as possible in case the persistent data queue
still contains samples to persist.
Until now a durability service that terminates does not store persistent data that is waiting to be persisted. An improvement to this behavior is to try and persist as much of the remaining data as possible without exceeding the ServiceTerminatePeriod. Persisting as much data as possible is a best-effort attempt to save valuable data in case a durability service is terminated. Solution: The durability service uses part of the ServiceTerminatePeriod to store remaining data that has not yet been stored at the time the durability service starts to terminate. |
OSPL-4695/OSPL-5262/ 13414 |
RMI: Interface unregistration problem
When an interface was unregistered and the runtime was then shut down, a NullPointerException was raised, and any attempt to register the same interface a second time failed with "Interface X already registered". Solution: The interface is now properly removed from the runtime interface registry. |
OSPL-4696 |
Restarting an RMI runtime causes failure
On Runtime stop, in CInterfaceRegistryImpl.java, the m_Reader reader is closed (directly or indirectly), but the field is not set to null. So after a restart, the CInterfaceRegistry tries to reuse the reader without success. Solution: m_Reader is now reset in the clear method. |
OSPL-4871 |
Reference guide update for WriterDataLifecycleQosPolicy
The WriterDataLifecycleQosPolicy documentation was missing the autounregister_instance_delay and autopurge_suspended_samples_delay attributes. Solution: All reference guides have been updated with the descriptions. |
OSPL-4960 |
When Node Monitor is started it should publish all the enabled samples immediately
When the Node Monitor is started, the NodeStats and NodeInfoConfig writers should publish all enabled samples immediately and have their DURABILITY QoS set to non-VOLATILE, so that a late-joining DataReader (also configured non-VOLATILE) gets the last sample per key rather than waiting (potentially for a long time) until all the intervals have elapsed after nodemon startup. Solution: The NodeStats and NodeInfoConfig topics' durability QoS policy kind was changed from V_DURABILITY_VOLATILE to V_DURABILITY_TRANSIENT. |
OSPL-5146/ 13108 |
Not all signals properly handled when using pthread_kill(...)
When using pthread_kill(...), for example to abort a process, the process would continue to run. Solution: The signal handler has been modified so that signals generated with pthread_kill(...) are properly handled too. |
OSPL-5147 |
User should be aware that a runtime installation of OpenSSL is
required for OpenSplice licensed features and/or ddsi2e and snetworking -
an update
The addition of TLS in ddsi2 removed the static link to OpenSSL present in previous versions of OpenSplice on non-Windows systems. Solution: The requirement for the OpenSSL runtime now only applies to ddsi2e and snetworking on non-Windows systems. |
OSPL-5190/ 13188 |
Disabling the Java shutdownHook
Solution: The Java shutdown hook, which cleans up entities that were created by a Java application but not deleted during its execution, can now be disabled through a newly introduced system property: "osplNoDcpsShutdownHook". See the Java reference manual for more details. |
OSPL-5244/ 13406 |
Waitset associated with wrong DomainParticipant causes problems during clean-up
In case an application has multiple DomainParticipants participating in the same Domain and attaches a Status-, Read- or QueryCondition to a Waitset, the Waitset may be associated with the wrong DomainParticipant, because the selection algorithm was based on the DomainId instead of the DomainParticipant associated with the Condition. This may cause problems when deleting one or both DomainParticipants and even lead to a crash of the application in some cases. Solution: The DomainParticipant is now selected based on the DomainParticipant associated with the Condition. |
OSPL-5266 |
DDSI2 warns about a message without payload
DDSI2 used to emit warnings of the form "write without proper payload (data_smhdr_flags 0x2 size 0)" when receiving messages from a writer that have no content whatsoever. Such messages are allowed by the specification and hence should not result in a warning. Solution: The warning has been removed; the event is still logged in the trace. |
OSPL-5268 |
C-language binding used wrong type for the "subscription_keys" of
the CMDataReader built-in topic
The C language binding accidentally used a sequence of strings where it should have been a single string to describe the subscriber-defined keys. This caused crashes for a C program trying to use the CMDataReader built-in topic, and additionally caused the C binding to differ from the other language bindings. Solution: The type has been changed. |
OSPL-5321 |
Issue with DCPS SAC code-generation
The code-generation templates for typed DataReaders and DataWriters contained an error in the definition of the 'FooDataReader_get_subscription_matched_status' and 'FooDataWriter_get_publication_matched_status' methods. Since these methods were not available, users were forced to use the regular DDS_-prefixed methods, resulting in inconsistent code. Solution: The issue has been solved and the correct definitions are now generated by idlpp. |
OSPL-5335 |
DDSI can flag dispose/unregister messages from Café for CM topics as invalid
DDSI2 is designed to handle data containing a "proper" payload, but some DDSI implementations in some cases do not provide a real payload, only an alternative form of the key. DDSI2 translates these to well-formed payloads before interpreting them, but this translation was incorrect for some CM topics. Solution: The translation table has been updated to cover all cases. |
OSPL-5406/ 13796 |
Unable to find an entry point in C# API
When using a call like GetDiscoveredParticipants in the C# API, an "Unable to find an entry point named 'u_instanceHandleNew' in DLL 'ddsuser'." exception could be thrown. Solution: This issue has now been fixed by referring the entry point to the correct library. |
OSPL-5411 |
RT networking interoperability fix due to ospl-4345 (6.4.1) fix
At 6.4.1, RT networking interoperability with older versions was degraded. Solution: RT networking now correctly handles versions prior to 6.4.1. |
Report ID. | Description |
---|---|
OSPL-3216 |
Built-in CMParticipant Topic accessibility.
The CMParticipant built-in Topic should be accessible to applications through a built-in DataReader. Solution: The CMParticipant built-in Topic and DataReader have been added to all language bindings. |
OSPL-4157/12727 |
C DCPS generated API doesn't compile with g++
The C API no longer compiled when using g++. Solution: The fault has been fixed and g++ will now compile the C API again. |
OSPL-4480/ 12870 |
Upon repeated stop/start of OpenSplice, the application received
the error "Max connected Domains (127) reached"
In our user layer the connected domains are stored in an array of 127 items. When a domain is connected, the next entry is used, skipping entries that have been freed by a domain disconnect. After 127 connects, the end of the array is reached, which results in this error. Solution: When a domain is disconnected, the entry in the array is marked free. When a new domain is connected, the array is searched from the beginning for a free entry, re-using locations that were in use and have been freed. |
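The slot-reuse scheme of this fix is simple to sketch (illustrative Python; the real table lives in the C user layer):

```python
class DomainTable:
    """Fixed-size table of connected domains that re-uses freed slots
    instead of only advancing past them."""
    def __init__(self, capacity=127):
        self.slots = [None] * capacity

    def connect(self, domain) -> int:
        for i, slot in enumerate(self.slots):   # search from the start
            if slot is None:
                self.slots[i] = domain          # re-use the first free slot
                return i
        raise RuntimeError("Max connected Domains (127) reached")

    def disconnect(self, index: int):
        self.slots[index] = None                # mark the entry free

table = DomainTable()
for _ in range(1000):                           # repeated stop/start cycles
    idx = table.connect("domain-0")
    table.disconnect(idx)
assert table.connect("domain-0") == 0           # slot 0 re-used every time
```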
OSPL-5192/ 13191 |
When the replace merge policy is invoked not all data is being replaced.
When a fellow durability service that acts as master for a set of namespaces leaves, a check is needed to see if an alternative aligner is available for each of these namespaces. If no alternative aligner is found, the merge state for the namespace must be cleared to ensure that a merge action is triggered as soon as a new aligner joins the network. This check should be carried out for all namespaces. However, once no alternative for a namespace was detected, the other namespaces were (wrongly) not checked anymore, so the merge states for those namespaces were not cleared and no merge policy was triggered for them. Solution: The code has been changed so that an alternative aligner is searched for all namespaces of the leaving fellow. |
OSPL-5197 |
CMSoap service can fail to terminate cleanly
The timeout handling in accepting new conditions in the cmsoap service could cause cmsoap to fail to terminate cleanly, instead automatically killing itself after the (configurable) service termination period. Solution: The timeout specification has been updated to avoid this issue. |
OSPL-5202/ 13197 |
Tuner does not accept character code input for c_char fields.
If a user wanted to write a character value for a c_char field that was not found on the keyboard (like the NUL character or the LF character), there was no way to input it in the Tuner writer window. Solution: The Tuner writer input for c_char fields now accepts octal character codes (e.g. \000 for NUL, \012 for LF, \141 for 'a', etc.). |
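Parsing such input is straightforward; a sketch of the accepted syntax (the Tuner itself is Java, this is only an illustration):

```python
def parse_char(text: str) -> int:
    """Accept either a single literal character or an octal escape such as
    \\000 (NUL), \\012 (LF) or \\141 ('a'); return the character code."""
    if text.startswith("\\"):
        return int(text[1:], 8)        # octal escape, e.g. "\\141" -> 97
    if len(text) == 1:
        return ord(text)
    raise ValueError("expected one character or an octal escape")

assert parse_char(r"\000") == 0        # NUL, impossible to type directly
assert parse_char(r"\012") == 10       # LF
assert parse_char(r"\141") == ord("a")
assert parse_char("a") == 97
```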
OSPL-5234 |
Possible crash of networking service when running out of DefragBuffers
The networking service could crash when it ran out of DefragBuffers and its garbage collector started releasing DefragBuffers. The crash happened when the garbage collector double released DefragBuffers that were still in use. Solution: The garbage collector no longer double releases in use DefragBuffers. |
OSPL-5239 |
DDSI could crash when a thread is killed
When a thread was killed after its participant had already been deleted, DDSI could crash trying to obtain the subscriber/publisher from this participant. Solution: Before trying to access the subscribers or publishers, DDSI now checks whether the participant is still valid. |
OSPL-5251/ 13405 |
Using synchronous reliability could cause a crash due to a memory corruption
Due to missing locking on a part of the synchronous reliability administration, the memory could become corrupted, causing a crash. Solution: Locking has been added to the relevant parts of the synchronous reliability administration. |
OSPL-5253 |
Recompilation rules broken for Java applications
Java applications generated before V6.4.0p5 wouldn't run on later versions and required code to be regenerated and compiled. This was mentioned in the release notes (OSPL-4333). Solution: An overloaded constructor has been added that supports the old format of generated code, allowing the applications to be used according to the recompilation rules. |
Report ID. | Description |
---|---|
OSPL-4614 / 12392 |
In some situations it is possible that a durability service
processes pending messages from another durability service
(a fellow) that has been removed recently, causing the fellow to
be added again.
When a durability service is busy it cannot process every incoming message from a fellow immediately. In case a message from a fellow is received and the fellow terminates before the message is processed, the fellow will be removed as a peer from the durability service. But when the pending message is processed, the durability service notices that it came from an unknown fellow, which is then (wrongly) added again. Solution: When a fellow has terminated, its address is remembered for a while. Messages originating from an address that has recently terminated are not processed, thereby preventing a 'new' fellow from being added. |
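The remembered-address mechanism amounts to keeping short-lived tombstones. A sketch (hypothetical names; timings illustrative, not the durability service's actual values):

```python
import time

class FellowAdmin:
    """Drop late messages from recently terminated fellows instead of
    resurrecting them as 'new' fellows."""
    def __init__(self, tombstone_period_s=10.0):
        self.fellows = {}
        self.tombstones = {}                  # address -> tombstone expiry
        self.period = tombstone_period_s

    def terminate(self, address):
        self.fellows.pop(address, None)
        self.tombstones[address] = time.monotonic() + self.period

    def on_message(self, address) -> bool:
        expiry = self.tombstones.get(address)
        if expiry is not None and time.monotonic() < expiry:
            return False                      # stale message: ignore it
        self.fellows.setdefault(address, object())
        return True

admin = FellowAdmin()
assert admin.on_message("fellow-A")           # fellow discovered normally
admin.terminate("fellow-A")                   # fellow leaves; tombstone set
assert not admin.on_message("fellow-A")       # pending message is dropped
assert "fellow-A" not in admin.fellows        # fellow is not re-added
```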
OSPL-4985 / 13053 |
When multiple master conflicts appear at the same time, only one
of them is handled, resulting in incorrect merges of historical
data.
In some cases it is possible that suddenly multiple nodes appear that are all master for the same namespace. This is for example the case when a firewall initially blocks traffic between 3 nodes A, B and C that all become master. When the firewall is disabled (thus enabling communication between all nodes), each node has 2 possible master conflicts. Both of these conflicts should be handled to ensure that data is correctly merged. Solution: No master conflicts are dropped anymore, and successive master conflicts are re-evaluated because they may have been invalidated by resolving previous master conflicts. |
OSPL-5125 |
Termination of DDSI2 in shared memory deployments on Windows
causes warnings
DDSI2 creates various objects for its internal administration and its interaction with the OpenSplice kernel. Some of these were not released in the termination path, causing the DDSI2 domain participant not to be deleted at the expected time because of unexpected outstanding references to it. This would then lead to an apparent crash of DDSI2, which would be logged. The problem surfaced only on Windows because of differences in the way atexit() is handled on Windows and on other platforms. On the other platforms, the domain participant would eventually be deleted properly for a clean shutdown. Solution: All objects are now released explicitly and the DDSI2 domain participant is deleted as planned on all platforms. |
Report ID. | Description |
---|---|
OSPL-5152 |
DDSI TCP interoperability with Vortex Cloud Routing Service
When using OSPL clients with Vortex Cloud where the routing service is involved, OSPL would fail to connect. Solution: DDSI over TCP now sends the correct ENTITY_ID submessage to discovery and routing services, enabling cloud-based routing. |
Report ID. | Description |
---|---|
OSPL-4583/4586/ 12946 |
When durability is used in combination with DDSI and DDSI generates
builtin topics, then durability may not align data because DDSI may
drop durability messages.
A durability service assumes reliable two-way communication with other durability services. This assumption no longer holds when DDSI generates builtin topics (see the <GenerateBuiltinTopics> tag). In this case it is possible that a durability service receives messages from another durability service, but responses to these messages are dropped by DDSI because not all readers have been discovered yet. If responses are dropped the durability protocol is broken, possibly leading to the inability to align data. Solution: If DDSI generates builtin topics then the durability service will only respond to a nameSpacesRequest message once all readers of the remote durability service have been discovered. Because the durability protocol always starts with the exchange of namespaces, discovery of all remote readers is now guaranteed. |
OSPL-5068 |
release.com/.bat override OSPL_URI
If OSPL_URI was set and then release.com/.bat was executed, OSPL_URI was overridden. Solution: OSPL_URI is now honoured. |
OSPL-5130 |
DDSI2 did not terminate within ServiceTerminatePeriod
The DDSI2 service did not terminate within ServiceTerminatePeriod. The listen_thread was blocking on 'accept()' call. Solution: On termination wake the listen_thread so termination can continue. |
OSPL-5141 |
DDSI2 TCP and TCP with SSL not consistent on blocking read/write
DDSI2 TCP and TCP with SSL were not consistent on blocking read/write, leading to inconsistent behaviour. Solution: A common timeout mechanism has been implemented for both TCP and SSL read and write operations that would block. TCP configuration options "ReadTimeout" and "WriteTimeout" have been added; these specify the timeout on a blocking read or write call, after which the call will be abandoned and the corresponding transport connection closed. These configuration options replace "ReadRetry", "ReadRetryInterval" and "WriteRetry", which have been removed. |
OSPL-5143/ 13107 |
Possible compiler warning in the C++ language binding
An initialiser used in the initialisation of a number of mutexes internal to the C++ language binding can cause compiler warnings. Solution: The initialisation has been modified. |
Report ID. | Description |
---|---|
OSPL-4983-1 |
DDSI over TCP interoperating with Vortex Café may drop the connection after 10s
When a Vortex Café process was connected to an OpenSplice Enterprise node using TCP, where Café was acting as a TCP client and Enterprise as a TCP server, Café could consider the Enterprise node dead because it was not receiving participant discovery data as it expects. Solution: Participant discovery data is now properly distributed over TCP connections. |
Report ID. | Description |
---|---|
OSPL-4983 |
DDSI using TCP cannot handle high data loads
Under high load, DDSI TCP connections could be dropped and recreated, losing samples in the process and with various long timeouts interfering with the correct operation. This was caused by incorrectly handling a full socket buffer at the start of writing a message. Solution: The code now correctly accounts for such events. |
OSPL-5118 |
Performance improvement for CDR deserialisation
The CDR deserialisers all used a sub-optimal way of allocating sequences. Especially for sequences in small samples sent at a very high rate, this had a significant performance impact. Solution: The allocation of sequences has now been changed to use a faster method. |
Report ID. | Description |
---|---|
OSPL-5104 |
DDSI doesn't properly account transient-local data in its WHC
DDSI stores unacknowledged samples and acknowledged but transient-local samples in a writer history cache. The amount of unacknowledged data is kept track of, but this was done incorrectly for transient-local data. This in turn could cause DDSI to lock up, in particular when a large amount of discovery data needed to be sent. Solution: The amount of outstanding unacknowledged data now reflects acknowledgements of transient-local data as well. |
OSPL-5105 |
DDSI can throttle a writer without ensuring an ACK is requested immediately
When the amount of outstanding unacknowledged data in a writer reaches a (configurable) threshold, the throttling mechanism blocks further data from that writer from being sent until the amount of unacknowledged data is reduced to below a (configurable) level. This requires the readers to send acknowledgements, which they are only allowed to do upon request from the writer. The last packet sent before the throttling often, but not always, includes a request for acknowledgements, and if it doesn't a 100ms delay is incurred. Solution: The writer now forces out a request for acknowledgements before blocking. |
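In outline, the fix makes the writer emit one more acknowledgement request before it blocks. A highly simplified model (message names and thresholds are illustrative, not the DDSI wire format):

```python
class ThrottlingWriter:
    """Before hitting the unacked-data high-water mark, force out a
    heartbeat that asks readers to acknowledge, instead of waiting for
    the next periodic heartbeat (~100 ms later)."""
    def __init__(self, high_water=3):
        self.unacked = 0
        self.high_water = high_water
        self.sent = []

    def write(self, sample):
        if self.unacked + 1 >= self.high_water:
            self.sent.append("HEARTBEAT(ack-requested)")  # request ACKs now
        self.sent.append(sample)
        self.unacked += 1                     # blocking would start here

w = ThrottlingWriter(high_water=2)
w.write("s1")
w.write("s2")                                 # crossing the high-water mark
assert w.sent == ["s1", "HEARTBEAT(ack-requested)", "s2"]
```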
OSPL-5106 |
DDSI delays NACKs unnecessarily when requesting samples for the first time
DDSI2 follows the specification in delaying NACKs a little, but only if the previous NACK was within the NackDelay interval. However, if it detects a need to request a retransmission of samples not covered in the previous NACK, waiting only introduces an unnecessary delay. Solution: The NACK scheduling now takes into account the highest sequence number in the preceding NACK. |
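The scheduling decision can be captured in one predicate. A sketch of the stated behaviour (names are made up; times in seconds):

```python
def may_send_nack_now(now, last_nack_time, last_nack_max_seq,
                      missing_max_seq, nack_delay):
    """Send immediately when the NACK requests samples beyond the previous
    NACK's highest sequence number; otherwise honour NackDelay."""
    if missing_max_seq > last_nack_max_seq:
        return True                    # first request for these samples
    return now - last_nack_time >= nack_delay

# A brand-new gap is NACKed at once, even right after a previous NACK:
assert may_send_nack_now(1.01, 1.0, 50, 60, 0.1)
# Re-requesting already-NACKed samples still waits out the delay:
assert not may_send_nack_now(1.01, 1.0, 50, 45, 0.1)
```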
Report ID. | Description |
---|---|
OSPL-5087/ 13087 |
High memory use by DDSI2 on WinCE.
On WinCE, DDSI2 could require large amounts of memory when transmitting large samples because of an issue in platform-specific code. Solution: The underlying issue has been addressed. |
OSPL-4506/ 12880 |
The use of a content-filtered topic causes a memory leak.
When creating a content-filtered topic, a memory leak occurs when evaluating the filter expression. The key field list used when evaluating the filter expression was not released, causing the memory leak. Solution: The key field list is now released when evaluating the content filter. |
OSPL-4878/ 13001 |
DeadlineQosPolicy when DataWriter is deleted keeps triggering
Reader listener/waitset keeps getting triggered for deadline missed after the instance is disposed and the writer deleted. A v_dataReaderInstance was unintentionally re-inserted in the deadline list right after it was intentionally removed. Solution: Removed the v_dataReaderInstance re-insert. |
OSPL-4974-1 |
DDSI TCP on Windows fails with large messages
When using large message payload, DDSI TCP would hang because the TCP buffer would overload. Solution: Error handling improved for blocking TCP write functions. |
OSPL-4974-1 |
Change default value for DDSI TCP configuration property NoDelay
Solution: Changed the NoDelay DDSI TCP configuration property to true (from false) to reduce jitter. |
Report ID. | Description |
---|---|
OSPL-4777-1/ 12989 |
In some situations it is possible that the durability service
sends out responses to requests in the reverse order. As a result
recipients of these responses may perceive a "wrong" state of groups
and namespaces.
Due to a threading issue it is possible that a durability service sends out responses to requests in the reverse order. In particular, the state of groups and namespaces could be reversed, causing recipients to believe that a group is 'incomplete' while in fact the master has announced its completeness. In this case the recipients will wait forever for the group to become complete. Solution: Once a group is complete it can never announce its 'non-completeness' anymore. Also, order reversal of announcing namespace states has been prevented. |
OSPL-4942-1/ 13042 |
In rare occasions a process could fail to detach properly from
SHM due to a race condition
Due to a race condition in checking whether a live process is ready to detach, the process could conclude that it still had threads in SHM, causing the detach to fail unexpectedly. As soon as this was detected by spliced, the domain was brought down. Solution: The race condition has been resolved. |
OSPL-5053 |
DDSI "malformed packet received" error with state parse:info_ts
DDSI verifies the well-formedness of the messages it receives, logging a "malformed packet" error if it is not. The message validator would reject a short timestamp even when the INVALIDATE flag was set on the submessage. Solution: DDSI now accepts empty timestamp submessages with INVALIDATE set. |
Report ID. | Description |
---|---|
OSPL-3868 |
DDSI2 group instance leakage in shared memory
The DDSI2 service caused a small memory leak in shared memory for each group instance written. Solution: Memory is now freed. |
OSPL-4095 |
DDSI stops when creating many readers/writers and no peers exist
The DDSI discovery protocol exchanges information on all endpoints, and does so by creating an instance in a reliable, transient-local writer (one for writers, one for readers) for each existing endpoint. An issue was discovered where this data is counted as unacknowledged data even when there are no peers, and creating readers/writers may cause DDSI to reach the maximum allowed amount of unacknowledged data in the writer. This in turn blocks various processing paths, and if there are no peers, there is no way out. Solution: When there are no peers, the data is no longer counted as unacknowledged data. |
OSPL-4607/ 12961 |
Deleting a writer does not free all shared memory allocated when creating it
Solution: Memory allocation has been tidied up; deleting a writer now frees all shared memory allocated for it. |
OSPL-4713-1/ 12978 |
Readers not disposing after using built-in subscriber
After termination, even after calling 'delete_contained_entities()' on the DomainParticipantFactory, the tester showed the participant as disposed with active DataReader for built-in topics. The 'delete_contained_entities()' function did not delete all contained entities and the built-in subscriber was not deleted. Solution: When calling 'delete_contained_entities()' on the DomainParticipant the built-in subscriber is now also deleted. |
OSPL-4713-2/ 12978 |
Shared memory runs out after running a very simple application in a loop
A small memory leak (96 bytes per ::v_message&lt;kernelModule::v_participantInfo&gt;) made it possible to run out of shared memory when a very simple application (factory 'get_instance()', 'create_participant()' and exit) was run in an endless loop. Solution: Memory is now freed. |
OSPL-4713-3/ 12978 |
Shared memory leakage on Windows platforms
On Windows platforms our exit handler was registered but never called when 'get_instance()' was called from the customer application context, which caused memory leakage. Solution: Our exit handler is now also called on Windows when our library is unloaded. However, Windows terminates threads ungracefully, which can still cause memory leakage if 'delete_contained_entities()' is not called before the library is unloaded. |
OSPL-4717/ 12982 |
When using wait_for_historical_data_w_condition with an OR condition,
it is possible that not all matching samples are returned.
When wait_for_historical_data_w_condition was used, the evaluation of the condition was not correct: if the condition consisted of OR expressions, not all parts of the OR expression were evaluated. Solution: The evaluation of the condition now walks over all OR elements of the condition and does not stop when it finds a match. |
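The fixed evaluation can be sketched as follows. This is an illustrative model of OR-condition handling, not the OpenSplice implementation; the condition is represented as a list of OR'ed predicates:

```python
# Illustrative model (not OpenSplice source): a condition is a list of
# OR'ed predicates; every sample matching ANY predicate must be returned.

def matching_samples(samples, or_clauses):
    """Walk over ALL OR elements instead of stopping at the first match."""
    return [s for s in samples if any(clause(s) for clause in or_clauses)]

samples = [1, 2, 3, 4, 5, 6]
clauses = [lambda s: s < 2, lambda s: s > 5]  # models "x < 2 OR x > 5"
print(matching_samples(samples, clauses))     # [1, 6]
```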
OSPL-4727 |
DDSI discovery heartbeat interval too long
The DDSI specification gives a DDSI participant various ways of renewing its lease with its peers, one of which is the periodic publishing of a full participant discovery sample. To reduce the bandwidth needed, DDSI would instead send some other data, but this is not good enough to maintain liveliness with all other implementations. Solution: DDSI now sends the participant discovery sample at an interval slightly shorter than the participant lease duration, which itself is taken from the OpenSplice domain expiry time, but never longer than the configured SPDP interval. |
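The resulting interval selection can be sketched as below; the 0.8 factor is purely illustrative, since "slightly shorter" is not quantified in the release note:

```python
def spdp_interval(lease_duration, configured_spdp_interval, factor=0.8):
    """Send participant discovery slightly more often than the lease
    expires, but never less often than the configured SPDP interval."""
    return min(lease_duration * factor, configured_spdp_interval)

print(spdp_interval(10.0, 30.0))   # 8.0  -- driven by the lease duration
print(spdp_interval(100.0, 30.0))  # 30.0 -- capped by the SPDP interval
```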
OSPL-4874 |
RnR replayed data not arriving on remote nodes using DDSI
When RnR replays a recording it relies on the networking service to distribute the data from the RnR service to any remote data readers. The writer instance handles used by the replay did not match known writers for DDSI, hence DDSI was unable to determine where to send the data, and ultimately causing DDSI to drop the data. Solution: RnR now creates local data writers and remaps the writer instance handles in the replayed data to correspond to these known writers, allowing DDSI to distribute the data throughout the network. |
OSPL-4962 |
Detecting which participant represents DDSI2 in a federation is difficult
DDSI2 itself acts as a participant in the system, and hence creates a participant at the DDSI level as well. For other DDSI implementations it may be useful to be able to detect which of the remote participants is a DDSI2 service. Solution: DDSI2 has been enhanced to indicate in its discovery information which of the potentially many participants in a federation is the DDSI2 service itself. This enhancement is backwards compatible. |
OSPL-4963 |
Domain ControlAndMonitoringCommandReceiver/Scheduling/Priority setting applied incorrectly
The value configured for the ControlAndMonitoringCommandReceiver/Scheduling/Priority for the domain was applied to the ResendManager thread rather than to the CandM thread. Solution: This has been fixed. |
OSPL-4974 |
DDSI TCP may fail under high load
Under high load a DDSI TCP connection may fail and would not recover. Solution: The socket waitset has been made threadsafe. Under UDP only a single thread accessed the waitset, under TCP multiple threads are used. |
OSPL-4995 |
DDSI default socket buffer sizes increased
The socket buffer sizes have a significant impact on performance, and in particular a small receive buffer size when data comes in at a high rate can be a performance bottleneck. Unfortunately, there is no agreement across operating systems about the default maximum size, and hence in the past DDSI defaulted to a smallish buffer. Solution: The defaults and the warning policy for failure to set the buffer size have been changed. Without specifying a receive buffer size (or "default"), DDSI will default to requesting 1MB, but accept whatever the kernel provides. Explicitly specifying a buffer size will still result in an error message if the kernel refuses to provide it. |
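As an illustration, an explicit receive buffer request might look like the fragment below; the element name and its location under Internal are assumptions and should be checked against the deployment guide for your version:

```xml
<DDSI2Service name="ddsi2">
  <Internal>
    <!-- "default" requests 1MB but accepts whatever the kernel grants;
         an explicit size still produces an error if the kernel refuses -->
    <SocketReceiveBufferSize>2MB</SocketReceiveBufferSize>
  </Internal>
</DDSI2Service>
```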
OSPL-5002 |
Invalid messages accepted by DDSI2
The DDSI2 service verifies the well-formedness of the incoming messages, but two issues in the verification were discovered: firstly, it would accept invalid sequence numbers in data samples, even though the specification explicitly states such messages must be rejected, and secondly, it did not correctly verify that the start of the inline QoS or payload was indeed within the message. Solution: Both points have been corrected. |
Report ID. | Description |
---|---|
OSPL-1084/ 10782 |
When RTnetworking compression is activated the number of network
frames is not reduced.
When RTnetworking compression is activated, compression is applied to each network frame individually; this reduces the size of the frames but the number of frames remains the same. Solution: When compression is activated, data messages are now packed and compressed before fragmenting. This results in a better compression ratio and a reduced number of fragments (packets). |
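The effect of compressing before fragmenting can be demonstrated with a small, self-contained sketch; zlib stands in for the actual compressor, and the frame size and payloads are made up:

```python
import zlib

FRAME = 1024  # illustrative network frame payload size

# 50 repetitive "data messages" of ~760 bytes each
messages = [(b"sensor-reading-%03d " % i) * 40 for i in range(50)]

# Old behaviour: pack, fragment, then compress each frame separately --
# the frame COUNT is fixed before compression can help.
packed = b"".join(messages)
old_frames = [packed[i:i + FRAME] for i in range(0, len(packed), FRAME)]
old_count = len(old_frames)  # compression shrinks frames, not their number

# New behaviour: pack, compress the whole stream, THEN fragment.
compressed = zlib.compress(packed)
new_count = -(-len(compressed) // FRAME)  # ceiling division

print(old_count, new_count)  # far fewer frames after the fix
```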
OSPL-2023 |
DataReaderView doesn't take modified default DataReaderView QoS
when created (sacpp & ccpp)
The default DataReaderViewQos can be changed by calling set_default_datareaderview_qos() on the related DataReader. When a DataReaderView is created with DATAREADERVIEW_QOS_DEFAULT, it should take the QoS that was set with set_default_datareaderview_qos(). This didn't happen: the DATAREADERVIEW_QOS_DEFAULT placeholder itself was used as the QoS. Solution: During reader->create_view(), a check is now performed whether DATAREADERVIEW_QOS_DEFAULT was provided. If that is the case, the reader's internal default DataReaderViewQos is used instead. |
OSPL-3789/ 12578 |
The 'autopurge_dispose_all' value is added to the ReaderDataLifecycleQosPolicy.
When calling dispose_all() on a Topic, related readers will receive a disposed notification for all disposed samples. For performance reasons, it should be possible to suppress those per-sample disposed notifications and only trigger the on_disposed_all() notification on the ExtTopicListener; the samples should then be disposed automatically. This should be controllable on a per-reader basis. Solution: The 'autopurge_dispose_all' field is a new value in the ReaderDataLifecycleQosPolicy, which is part of the DataReaderQos. When set to 'true' (the default is 'false'), it ensures that all related reader samples are purged when dispose_all() is called on a Topic. The related reader will not be notified that the samples have been disposed by a dispose_all(). |
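A toy model of the policy's intent; this is not OpenSplice code, and the class and method names are illustrative:

```python
# Illustrative model of the new autopurge_dispose_all behaviour.

class Reader:
    def __init__(self, autopurge_dispose_all=False):
        self.autopurge_dispose_all = autopurge_dispose_all
        self.samples = []
        self.notifications = []

    def on_dispose_all(self):
        if self.autopurge_dispose_all:
            self.samples.clear()  # purge silently, no per-sample events
            self.notifications.append("on_disposed_all")
        else:
            self.notifications += [("disposed", s) for s in self.samples]

r = Reader(autopurge_dispose_all=True)
r.samples = ["a", "b"]
r.on_dispose_all()
print(r.samples, r.notifications)  # [] ['on_disposed_all']
```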
OSPL-3976/ 12652 |
If master selection in durability takes a long time the system
could stall
At start up, the durability service tries to determine masters for all its namespaces. If fellows are removed during the master selection phase, this also triggers master determination in a second thread. That thread then waits for the master selection lock, which has been taken by the first thread. If the first thread takes a long time to determine masters for its namespaces (typically the case for large systems with many nodes and namespaces), the second thread is stalled for a very long time. If this time exceeds the thread liveliness assertion period, the second thread is declared dead, which may lead to system failure. Solution: While the second thread is waiting for the first thread to release its lock, liveliness of the second thread is asserted regularly. This ensures that the second thread is not declared dead, even if initial master selection by the first thread takes significant time. |
OSPL-4041/ 12669 |
Idlpp for C# generates incorrect code when using const to const assignment
When generating code from an idl-file containing a const variable that is assigned to another const variable, idlpp for C# generated incorrect code. The latter assignment was generated as (null), because the C# implementation was missing for this case. Solution: Adjustments have been made to the idlpp tool; it now implements the const-to-const assignment case. |
OSPL-4070 |
RTNetworking supports setting the Differentiated Services Code Point (DSCP)
field in IP packets on windows
To set the Diffserv (DSCP) field in IP packets, the networking service used the IP_TOS socket option. However, since Windows Vista and Windows Server 2008, setting the IP_TOS option is no longer supported; on these and later versions of Windows the new QoS2 API has to be used. Solution: The networking service now maps a configured Diffserv value onto one of the Traffic Types supported by the Windows QoS2 API. When administrative privileges are available, the configured Diffserv value is set on the traffic flow associated with the socket, so the Diffserv field of the IP packets is set to the configured value. When no administrative privileges are available, the Diffserv field corresponds to the selected Traffic Type. |
OSPL-4282/ 12770 |
When using OSGi without proper exports a crash could occur.
Consider two OSGi bundles: one containing dcps.jar (dcpssaj-osgi-bundle.jar), and a second containing idlpp-generated typed code without exports plus an application. When the second bundle accessed the first bundle, which in turn tried to access a class from the second bundle using the JNI FindClass function, a crash occurred because of the thrown exception. Solution: To prevent this crash, exceptions thrown by the JNI FindClass function are now caught and a log message explaining what went wrong is written to the error log. |
OSPL-4371/ 12817 |
Unclear logging when services are killed because of elapsed
serviceTerminatePeriod
When the serviceTerminatePeriod elapses during shutdown the ospl-tool logged an ambiguous message to the info log. The Splice daemon should have logged a clearer service kill message, but this was never reached because the ospl-tool would terminate the Splice daemon prematurely. Solution: Clarified service kill messages for both ospl-tool and Splice daemon. Increased ospl-tool wait period before sending kill signal to Splice daemon process group. Additionally, durability now logs messages when it fails to assert its liveliness within the expiration period. |
OSPL-4345/ 12809 |
The adminQueue may overflow when receiving thread is busy
processing messages and the sending thread is not scheduled in time.
The sending thread is responsible for transmitting ACK messages. For that purpose the receiving thread uses the adminQueue to inform the sending thread of the data messages received. When the receiving thread is busy processing received data, the adminQueue may fill up because the (lower-priority) sending thread is not scheduled in time. Solution: The receive thread has been made responsible for sending the ACK messages. This relaxes the timing requirements of the sending thread. |
OSPL-4385/ 12821 |
When using edge case resource limits, OpenSplice didn't behave as expected.
When using a resource setting of max_samples=1 and history=KEEP_LAST for the reader, samples weren't overwritten as expected; instead an error was returned. Solution: The reader and writer resource limits are now checked more thoroughly so that samples are overwritten when allowed. |
OSPL-4423 |
User should be aware that a runtime installation of OpenSSL is
required for OpenSplice licensed features and/or ddsi2e and snetworking.
The addition of TLS support in ddsi2 removes the static link to OpenSSL that previous versions of OpenSplice used on non-Windows systems. Solution: At runtime an installation of OpenSSL is required for licensed features; on most systems this is standard. |
OSPL-4463/ 12863 |
The use of sequences is not supported in multi-domain applications.
The issue is located in the copy-in routines generated by the IDL pre-processor, which are used when the application performs a write operation. To improve performance, these routines cache some type information about contained sequences. This causes a problem when writing the same type to multiple domains, because the cached type information is domain-specific. Solution: An option (-N) has been added to the IDL pre-processor which disables the type caching in the generated copy-in routines. |
OSPL-4530 |
Improved DDSI robustness
In high-throughput situations, DDSI2 could behave quite badly, with retransmit storms and/or temporarily considering reliable readers unresponsive and treating them as effectively best-effort. The long default participant lease duration caused these effects to linger for a long time even after restarting part of the application. Solution: The risk of retransmit storms and the associated effects has been reduced by improving the mechanism used to control the rate of retransmit requests and improved control over the amount of outstanding unacknowledged data, by configuring bytes rather than samples. The default participant lease duration is now controlled by the ExpiryTime configured for the domain, and will therefore typically have a more reasonable value. |
OSPL-4534 |
In cases where DDSI generates builtin topics there is no need
for durability to align the builtin topics.
DDSI can discover entities and can generate builtin topic information. This enables non-enterprise nodes in the DDSI network to become visible. Also, in cases where DDSI generates builtin topic information there is no need for durability to align builtin topic information, which saves bandwidth. Solution: Durability will not align builtin topics if ALL DDSI services generate builtin topics and no native networking services are configured. DDSI can be forced to generate builtin topics through its configuration. |
OSPL-4596/ 12954 |
A sample predating the oldest sample in the history of a TRANSIENT
or PERSISTENT instance could overwrite a newer sample
Due to a fault in the mechanism used to insert a sample into the history of a TRANSIENT or PERSISTENT instance, a sample predating the oldest sample in the history would replace that oldest sample instead of being discarded. This would cause late-joining readers to observe an inconsistent history. Solution: The mechanism has been fixed to properly order the samples; a sample that does not fit in the history is now discarded instead of overwriting the oldest sample. |
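The corrected ordering rule can be sketched as follows; this is illustrative only, with samples represented by their write timestamps and `depth` standing for the history depth:

```python
# Illustrative sketch: insert a sample into a KEEP_LAST history ordered by
# write timestamp. A sample older than everything in a full history is
# discarded rather than replacing the oldest sample.

def insert(history, sample, depth):
    if len(history) == depth and sample < history[0]:
        return history                  # predates the oldest: discard it
    history = sorted(history + [sample])
    return history[-depth:]             # keep the newest `depth` samples

h = [10, 20, 30]
print(insert(h, 5, 3))   # [10, 20, 30] -- old sample discarded
print(insert(h, 25, 3))  # [20, 25, 30] -- inserted in order
```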
OSPL-4596-1/ 12954 |
After durability alignment of a dispose_all message, only the first
instance is NOT_ALIVE_DISPOSED
When the durability service needed to align a fellow, a stored dispose_all message was only applied to the first instance of the topic. The dispose_all sample for subsequent instances was incorrectly marked as a duplicate because the duplicate test only considered writerGid, writeTime and sequenceNumber. Solution: The durability service now also compares the key values of samples before marking them as duplicates. |
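A minimal sketch of why the key values must be part of the duplicate test; the field names are illustrative, not the internal representation:

```python
# Two dispose_all samples for DIFFERENT instances share writerGid,
# writeTime and seqNum -- without the key values they collide.

def dedup_key(msg, include_keys):
    key = (msg["writerGid"], msg["writeTime"], msg["seqNum"])
    return key + (msg["keys"],) if include_keys else key

msgs = [
    {"writerGid": 1, "writeTime": 100, "seqNum": 7, "keys": "instanceA"},
    {"writerGid": 1, "writeTime": 100, "seqNum": 7, "keys": "instanceB"},
]
print(len({dedup_key(m, False) for m in msgs}))  # 1 -- B wrongly a duplicate
print(len({dedup_key(m, True) for m in msgs}))   # 2 -- fixed behaviour
```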
OSPL-4596-2/ 12954 |
After durability alignment of a dispose_all message and a
delete_dataWriter the instance_state does not go to NOT_ALIVE_NO_WRITERS
When the durability service needed to align a fellow with a dispose_all message, an implicit registration message was created for a NIL writer. This NIL writer was never removed, causing the dataReader instance_state to remain ALIVE. Solution: The durability service now also sends an implicit unregister message after it has sent an implicit registration for a NIL writer. |
OSPL-4596-3/ 12954 |
When a dataReader instance_state is NOT_ALIVE_NO_WRITERS and it
receives a dispose_all, the instance_state does not transition to
NOT_ALIVE_DISPOSED
When a dataReader is in the NOT_ALIVE_NO_WRITERS instance_state and the last action was a TAKE, the instance pipeline is destroyed. The group updates its state when a dispose_all message is received, but could not forward it to the dataReader, so the dataReader instance_state did not change. Solution: The group now checks all dataReader instances; when a dataReader has no writers, implicit registration and unregister messages are sent so that the instance pipeline is reconstructed and destroyed again after the dispose_all has been received by the dataReader. |
OSPL-4614/ 12392 |
Durability does not handle merge policy correctly in some cases
with terminating and (re)connecting fellows
Solution: This has been fixed. |
OSPL-4777/ 12989 |
A deadlock can appear when durability tries to use the KV store
during initial alignment, which causes durability to halt forever
During initial alignment, access to the KV store may be required. In this phase two threads are competing for two resources: the durability administration and the store. These threads tried to lock the resources in different orders, which could lead to a deadlock of the durability service. Solution: The KV store no longer requires a lock on the durability administration. This prevents the deadlock. |
OSPL-4800/ 12990 |
Multiple Ctrl^C can cause a crash in the exit-request handler.
Termination requests received in rapid succession could cause a crash in the exit-request handler. Solution: The handlers installed by services are now executed only once. |
OSPL-4896/ 13015 |
The durability persistentDataListener thread failed to make progress when
using the KV store, causing the system to terminate and execute its failure action.
The KV store uses transactions to persist data. When there are many samples to persist, a transaction can take a very long time, possibly exceeding the time available to assert liveliness. When the time to complete such a transaction exceeds the liveliness assertion time, the responsible thread is declared dead and no leases are renewed anymore, causing the system to execute its failure action. Solution: Two improvements have been implemented to ensure the persistentDataListener thread can make progress. The first is to use the liveliness expiry time instead of the heartbeat expiry time to decide whether assertion of liveliness has succeeded; the former is typically larger than the latter, resulting in a more relaxed liveliness assertion policy. The second is to skip liveliness checking during potentially intensive operations on the KV store, such as commit and delete. |
OSPL-4916/ 13019 |
OSPL Source build required MICO and had kvstore library names incorrect
Customers with access to the OSPL source build noted a dependency on MICO for building the source code, and that some library links were incorrect. Solution: MICO is now optional and the links are named correctly. |
OSPL-4920/ 13020 |
When networking compression is used then occasionally an error
"Received incorrect message" is reported.
When compression is activated in the networking service and a received compressed network frame contains a user data message whose type is not (yet) known on the node, the networking service cannot deserialize that message; it should skip it and continue with the next user data message in the frame. However, in that case the buffer administration was not correctly updated, resulting in the reported error and the rest of the frame being dropped entirely. Solution: The buffer administration is now updated correctly when skipping a user data message for which the type information is not known. |
OSPL-4938/ 13038 |
Overflow for network queue resulted in a stackoverflow during cleanup of network reader
Unregister messages were not obeying the maximum queue size. Solution: The message size check now rejects a message when the queue is equal to or greater than the maximum queue size, as unregister messages can grow the queue beyond its maximum size. |
OSPL-4942/ 13042 |
Reason for termination of domain not reported in all situations
The splice-daemon attempts to clean up shared resources of processes that terminated without cleaning them up. If it fails to do so, it does not report anything in the log files in some situations before stopping the domain. Additionally, if the cleaning up did not complete within 1 second, the splice-daemon assumed that cleaning up had failed. Solution: Extra logging has been added to ensure the reason for stopping is clear for users. Furthermore, the time out for cleaning up has been slaved to the existing lease expiry time-out (//OpenSplice/Domain/Lease/ExpiryTime) instead of a fixed period of 1 second. |
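The lease expiry time that now also governs the clean-up time-out is configured as shown below; 10.0 seconds is an example value, not a recommendation:

```xml
<OpenSplice>
  <Domain>
    <Lease>
      <!-- The splice-daemon clean-up time-out now follows this value
           instead of the previous fixed 1-second period -->
      <ExpiryTime>10.0</ExpiryTime>
    </Lease>
  </Domain>
</OpenSplice>
```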
OSPL-5020/ 13068 |
Inconsistency between report level verbosity and reports for FATAL and CRITICAL verbosity
Reports at levels FATAL and CRITICAL were emitted as "FATAL ERROR" and "CRITICAL ERROR" respectively. This is inconsistent and produces open and close tags that include whitespace (e.g. "&lt;FATAL ERROR&gt;...&lt;/FATAL ERROR&gt;") for the report plugin. Solution: The reports are now emitted with the text FATAL and CRITICAL, corresponding to the verbosity level. |
Report ID. | Description |
---|---|
OSPL-4384/ 12820 |
Nested IDL modules not properly handled by the C++ RMI compiler
With rmipp, the handling of nested IDL modules was generating incorrect code. Solution: A bug in the rmipp code generators has been fixed, so that nested IDL modules map properly onto nested C++ namespaces. |
OSPL-4517/ 12882 |
Deadlock in listenerEvent when terminating domain and calling
delete_contained_entities
A deadlock could occur when an application created a domainParticipant with listeners and the domain was terminated while the application called delete_contained_entities. The listenerEventThread would wait forever on a waitset that was never signalled, because the notification failed due to the splice daemon no longer running. Solution: The listenerEventThread now uses a polling wait loop, allowing it to detect stop requests. |
Report ID. | Description |
---|---|
OSPL-4046/ 12643 |
The cmsoap service crashes when there is no network connection
present at startup of the service.
The cmsoap service tries to determine the IP addresses through which it can be reached. These IP addresses are set in the user data field of the DCPSParticipant builtin topic, which enables other tools to connect to the soap service. However, when no IP address was available, a crash occurred because of access to uninitialized memory. Solution: When the cmsoap service cannot detect an IP address, it now uses the loopback IP address instead. |
OSPL-4079/ 12677 |
The write method incorrectly returns TIMEOUT in case there are not
enough resources available and none can be freed in time.
When the write method detects that there are not enough resources available and that resources will not become available in time (e.g. when max_instances is exceeded), it should return OUT_OF_RESOURCES instead of TIMEOUT. Solution: When the max_instances resource limit is exceeded, OUT_OF_RESOURCES is now returned. |
OSPL-4410/ 12826 |
Deadlock in parallel demarshalling termination
When starting and stopping parallel demarshalling within a short time window, the termination of parallel demarshalling could get stuck in a deadlock. Not all spawned threads would terminate, because the terminate flag was reset before all parallel demarshalling threads were operational. Solution: The set_property API function with property name parallelReadThreadCount now blocks until all parallel demarshalling threads are started and operational, and the terminate flag is now reset upon (re)start of parallel demarshalling. |
OSPL-4464/ 12864 |
"FATAL ERROR Open Splice Control Service status monitoring failed.
Exiting." logged when sending signal to blocking OSPL tool.
When "ospl -f start" was executed and a signal was sent to the OSPL tool, a FATAL ERROR message was logged. This was not a real fatal error: the part of the OSPL tool that monitored the liveliness of the splice daemon wasn't aware of incoming signals and logged a FATAL ERROR even though the splice daemon terminated normally. Solution: The part of the OSPL tool that monitors the liveliness of the splice daemon is now aware of termination caused by a received signal. |
OSPL-4490/ 12859 |
When a DataWriter exits unnaturally, the LivelinessStatus is incorrect.
When a DataWriter exited unnaturally while it was Alive, the LivelinessStatus was updated incorrectly, causing an illegal state transition. Solution: The liveliness state change for unnatural DataWriter exits now uses the last known state before transitioning to DELETED. |
OSPL-4509 |
DDSI2E now accepts DDSI2 configurations
Solution: DDSI2E required that the configuration in the OpenSplice XML configuration file was tagged "DDSI2EService", but this made it impossible to switch to the DDSI2E service without changing the configuration file. DDSI2E now also accepts configurations under the DDSI2Service tag. |
Report ID. | Description |
---|---|
OSPL-878/ 10549 |
The 'ospl start' command can exit before the DDS Domain is up.
On somewhat slower systems, the 'ospl start' command can exit before the DDS Domain is up. This would mean that creating a DomainParticipant immediately after 'ospl start' can fail. Solution: The 'ospl start' command now waits until the DDS Domain is up before exiting. |
OSPL-4333/ 12801 |
Java language binding fails with multiple package redirects
Java language binding fails when multiple packages are redirected and a type containing a type from a redirected package is registered. Solution: Pass all redirect instructions for all types to the Java language binding. |
Report ID. | Description |
---|---|
OSPL-4116 |
A potential crash could occur when durability terminates
The durability service notified the splice daemon too early when it was about to terminate. This could lead to a situation where the splice daemon destroyed the shared memory while durability was still busy cleaning up objects and thus accessing the shared memory, which could crash the system. Solution: The durability service now notifies the splice daemon after it has cleaned up all objects and no longer needs access to shared memory, so the splice daemon can safely destroy the shared memory. |
OSPL-4250 /12744 |
JNI attach listener crash results in OpenSplice crash without any error report
When a crash occurred inside the JNI call that attaches the listener thread to the application, OpenSplice crashed without a proper report. Solution: A proper error report is now generated so the customer knows what went wrong. |
OSPL-4251 /12753 |
Late joining readers not getting complete historical data when more
than one networking service configured
When more than one networking service is configured, duplicate messages may be received. A reader will filter out these duplicate messages; however, the group did not filter them, and when the corresponding Topic QoS has a history depth greater than 1, the duplicate messages could be stored in the group. This could cause a late-joining reader to receive an incorrect number of samples. Solution: The group now checks whether a duplicate message has been received and drops the duplicates. |
OSPL-4270 /12765 |
DDSI2 not supporting QoS changes not documented
The DDSI2 networking service does not (yet) support QoS changes, instead silently ignoring them, but this was not mentioned in the documentation. Solution: This limitation is now stated clearly in the DDSI2 release notes. |
OSPL-4320 |
Java 7 linux 64 bit crash with Listener example
When running the Listener example under 64-bit Linux with Java 7, it could crash. Solution: The default listener stack size was 64k; for Java 7 this needs to be at least 128k, so the default listener stack size has been increased to 128k. |
OSPL-4342/ 12807 |
Deserialisation issues with Java CDR-based copy-out, high-performance
persistent store and RnR binary storage
The CDR serialiser used for Java CDR-based copy-out, the new high-performance persistent store and RnR binary storage could introduce incorrect padding in the CDR stream under some circumstances. To a reasonable approximation, this requires a type of unbounded size, or one where the maximum size is several times the minimum size, AND where the content of the data results in a serialised size larger than 16kB, AND where a string, sequence or array with alignment of less than 8 bytes requires a new block at a time the CDR stream is not aligned to a multiple of 8 bytes. Solution: The CDR serialiser now maintains the alignment of the stream when it switches to a new block. |
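The alignment rule that must be preserved across block boundaries is simply the padding computation below (a generic CDR alignment sketch, not OpenSplice source):

```python
# Padding needed to align `offset` to an `alignment`-byte boundary in a
# CDR stream; the bug was losing this alignment when the serialiser
# switched to a new block mid-stream.

def pad(offset, alignment):
    return (-offset) % alignment

print(pad(13, 4))  # 3 -- three padding bytes before a 4-byte-aligned field
print(pad(16, 8))  # 0 -- already aligned
```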
OSPL-4366 |
Durability service may crash during termination
During termination, the durability service stops its threads and cleans up its administration. Due to the fact the main thread cleans up some part of the administration that is used by the lease thread before ensuring that thread stopped, that thread may access already freed memory which may cause the service to crash. Solution: The main durability service thread now ensures the lease thread has stopped before freeing the administration. |
OSPL-4390 |
SOAP service may crash when concurrently using and freeing the same entity
The SOAP service allows ADLINK tools to connect to a node or process remotely. Due to the multi-threaded nature of the service, multiple requests can be handled concurrently. When the same entity is concurrently accessed and freed during two or more requests, the service may crash due to the fact one of the threads is trying to access already freed memory. Solution: The internal API of the SOAP service has been re-factored to claim an entity when it is used and release it afterwards. When an entity is freed when one or more claims are still outstanding, new claims are denied and the actual deletion is postponed until all of the outstanding claims have been released. |
OSPL-4421 |
Error reports about instance handles are mixed up
Each call that has an instance handle parameter as well as a sample parameter on the DataWriter entity (like for instance the write call), validate whether the provided instance handle belongs to the DataWriter and if so validates whether the key-values in the sample match the key-value that is associated with the instance handle. If one of these conditions is not true, an error is reported and the call fails. However, the errors that are printed in the two failure cases have been mixed up causing the wrong error message to be reported in both these cases. Solution: The error reports have been updated to match the actual error that occurred. |
Report ID. | Description |
---|---|
OSPL-4286 12771 |
The read_w_condition may incorrectly return no data when a
read_next_instance_w_condition is called before.
The read_next_instance_w_condition may incorrectly set the no-data property of the associated query to indicate that no data matches the query. This may cause a subsequent read_w_condition to return no data when data is available. Solution: The read_next_instance_w_condition now only sets the no-data property of the associated query when a complete walk over all the instances has been performed. |
OSPL-4327 12797 |
A crash of the durability service may occur when samples containing
strings with non-printable characters are stored in the XML persistent
store.
When the persistent XML store contains samples with strings containing non-printable characters, the durability service may crash because the layout of the XML storage file is not as expected. Solution: The XML serializer used by durability to serialize the samples for the XML store now escapes non-printable characters. |
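As an illustration of what the fix does, the sketch below escapes markup and non-printable characters as numeric character references before a value is written to a store. This is a hypothetical Python sketch; the actual durability serializer is internal C code and its exact escape syntax is an assumption here.

```python
def escape_for_xml_store(value: str) -> str:
    """Escape XML markup characters and non-printable characters as
    numeric character references so the stored document keeps a
    predictable layout. Illustrative only: the real serializer's
    escape syntax may differ."""
    out = []
    for ch in value:
        if ch in "<>&" or not ch.isprintable():
            out.append("&#%d;" % ord(ch))  # e.g. "\x01" -> "&#1;"
        else:
            out.append(ch)
    return "".join(out)
```

The important property is that the escaping is reversible and that no raw control character or markup character ever reaches the stored XML.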
Report ID. | Description |
---|---|
OSPL-4173 12731 |
The notification of a sample lost event by the networking service may result in a crash of the networking service.
When the networking service detects that samples have been lost, it tries to notify the corresponding readers of this event. The sample lost event is recorded in the status of the reader. However, there are internal readers used by Opensplice which do not have an associated status object, causing the crash of the networking service. Solution: All internal readers are now provided with a status object. |
OSPL-4269 12760 |
Ospl reports an error if no persistent file is present and using KV Store.
With the KV store, at ospl startup, if no persistent file is available, ospl reports an error: getset_version: read of version failed. Solution: The superfluous error message has been removed; there is no behavioural change. |
OSPL-4319/4330/4344/4346 12778/12779/12808 |
Crashes due to shared memory allocator issue
A refactoring of common code introduced an issue in the shared memory sub-allocator dealing with allocating "large" objects that could result in crashes or reports of heap corruption under high-load scenarios. For crashes, this typically (but not necessarily) involves stack traces involving the "check_node" function. Solution: The specific changes have been reverted until they can be corrected and re-tested. |
OSPL-4322 12747 |
Nullpointer exception when creating a reader/writer using the Tuner
When using the Tuner to create a reader/writer, it is possible to get a null pointer exception when selecting a different topic from the pulldown menu. Solution: The exception is now caught and will no longer appear. |
Report ID. | Description |
---|---|
OSPL-3445 12408 |
Inconsistent behaviour of service when handling signals.
The service should handle asynchronous signals like SIGQUIT or SIGTERM as normal termination requests, which should not trigger a failure action. However, the handling of these termination requests was not correct, which could result in either a normal termination or an exception that triggers the failure action. Solution: When a service receives a termination request signal like SIGQUIT or SIGTERM, it initiates a normal termination of the service and does not trigger a failure action. When the service receives a synchronous signal like SIGSEGV, either as the result of an exception or sent asynchronously to the service, it detaches from shared memory and triggers the failure action. |
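The intended distinction can be sketched as follows, assuming a POSIX platform (Python is used for brevity; the services themselves implement this in C). An asynchronous termination signal merely sets a shutdown flag so the main loop can detach from shared memory cleanly, instead of triggering the failure action.

```python
import signal

shutdown_requested = False

def request_shutdown(signum, frame):
    # Treat SIGTERM as a normal termination request: record it and
    # let the main loop detach from shared memory cleanly, rather
    # than treating the signal as a failure.
    global shutdown_requested
    shutdown_requested = True

# Install the handler for the asynchronous termination signal.
signal.signal(signal.SIGTERM, request_shutdown)

# Deliver SIGTERM to this process to demonstrate the graceful path.
signal.raise_signal(signal.SIGTERM)
```

A synchronous signal such as SIGSEGV would instead be left to a failure handler, since the process state can no longer be trusted.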
OSPL-3695/ 12553 |
Shared memory consumption would increase to unacceptably high levels
when using KV persistency
When KV persistency is enabled, shared memory consumption could reach unacceptably high levels. This was caused by two phenomena. First, the StoreSessionTime configuration option was not respected, causing the system to store KV samples as long as samples were available. Second, an inefficient algorithm was used to store samples on disk, resulting in (expensive) disk access for every sample. Together, these phenomena caused the system to pile up samples in memory that could not be stored in time. Solution: A more efficient algorithm is used to store samples on disk, resulting in disk access for a set of samples instead of an individual sample. This boosts performance when writing samples to disk and reduces the risk of piling up data in memory, which caused the unacceptably high memory consumption. Also, the StoreSessionTime is now respected. |
OSPL-3866 |
On Windows, setting a long lease time in the configuration results
in a large error log during ospl stop
When a termination request was made, some services with long lease times set in their configuration would generate a large amount of error messages on Windows, as the termination was not acknowledged within the lease time. Solution: The sleeping lease threads of ospld, durability and soap are now signalled to stop during the sleep in case of a termination request. |
OSPL-4075 12678 |
The network partition mapping of the expression . does not function correctly.
When the network partition mapping expression . is evaluated to find the best match, the global partition is selected instead of the configured network partition. Solution: Exclude the global partition from the search for a best matching network partition and select the global partition only when no other network partition can be found that matches the mapping expression. |
OSPL-4205 12740 |
The read_instance method sometimes returns ALREADY_DELETED but the
reader entity has not been deleted.
This situation occurs when the instance referenced by the instance handle supplied to the read_instance has been deleted, for example when the instance has become disposed and unregistered. In that case the instance handle becomes invalid. The return code ALREADY_DELETED is incorrect and should be BAD_PARAMETER, to indicate that the instance is no longer valid. Solution: When the read_instance detects that the provided instance handle has become invalid, it returns BAD_PARAMETER. |
OSPL-4245 12748 |
Linker error in custom library compilation for CORBA C++ cohabitation
with V6.4.0
DDS_CORE is no longer set in the custom lib environment, so it can no longer be used by the linker. Solution: The custom lib makefiles have been changed by replacing the DDS_CORE environment setting with ddskernel. |
Report ID. | Description |
---|---|
OSPL-31 |
idlpp did not generate valid Java code when a union had a case
called 'discriminator' When an IDL union contained a case called 'discriminator', the generated Java code would contain two conflicting definitions for the 'discriminator' method. This method is always included in a class generated from a union to obtain the value of the union discriminator. With a discriminator case, an additional discriminator method is added with the same signature that returns the value of the discriminator field. Solution: In accordance with the IDL to Java specification, which prescribes that the function returning the discriminator value be prefixed with '_' if the union contains a case called 'discriminator', the generated method is now named accordingly. |
OSPL-1631 11194 |
Reporting does not include timezone information In scenarios where nodes are joining and/or leaving a domain, the timestamps in the default info and error logs did not include timezone information. When the timezone of a system is altered while OpenSplice is running, the reports may appear out of order. Solution: To resolve any uncertainty, the locale-dependent abbreviated timezone has been added to the date format. |
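The change amounts to appending the locale-dependent abbreviated timezone to the report timestamp format. A minimal sketch (Python for brevity; the exact format string used by the report mechanism is an assumption):

```python
import time

def report_timestamp(seconds=None):
    # "%Z" expands to the locale-dependent abbreviated timezone
    # (e.g. CET or CEST), removing ambiguity when the system
    # timezone changes while the middleware is running.
    return time.strftime("%b %d %H:%M:%S %Z", time.localtime(seconds))
```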
OSPL-1705/1713/1714 |
Durability service XML persistency handles topics with string keys
incorrectly If multiple string keys exist for a Topic that is being persisted by the durability service, samples for different instances may be interpreted as samples for the same instance, potentially causing samples to be overwritten while both are supposed to be maintained. Secondly, if one or more key-values contained new-lines, storage in XML was also done in a way that prevented the data from being republished correctly after a system restart. Finally, if the string key matched the "" closing tag in the XML implementation, samples matching this key would not be persisted in all cases. Solution: Key-values are now escaped when storing them in XML. The change is backwards compatible, meaning that the new version can cope with old persistent stores. |
OSPL-2360 |
Strange error message from DDSI2 for truncated packets on Windows On Windows, when a message is truncated because there is insufficient receive buffer space available, the error message produced by DDSI2 would be somewhat confusing, because Windows reports this as an error whereas DDSI2 assumed POSIX behaviour of treating this as an unusual situation rather than an actual error. The behaviour of DDSI2 was ok, in that it discarded the message regardless of the platform. Solution: The error reported by Windows is now recognised and reported properly. |
OSPL-2485 |
Idlpp generated invalid java for a union with only a default case When a union only contained a default case, Java code was generated which performed an invalid check on the discriminator and therefore did not compile. Solution: The discriminator check is not valid when there is only a default case; the applied fix removes the check in this scenario. |
OSPL-2519 |
OS_INVALID_PID is accepted as a valid processID in the abstraction layer The OS abstraction layer functions os_procDestroy and os_procCheckStatus accepted OS_INVALID_PID as valid input. Especially in os_procDestroy, which is able to send signals to processes, this could have caused undesired behaviour. Solution: The functions os_procDestroy and os_procCheckStatus now return invalid when processID OS_INVALID_PID is passed. |
OSPL-2616 |
Internal change: File extension change for files generated for Corba co-habitation CPP When generating code from your idl file, the TAO idl compiler would generate inline files with the .i extension. Solution: These inline files now use the default file extension used by TAO, which is .inl. |
OSPL-3042 |
The library versions of sqlite and leveldb supplied by Opensplice
may conflict with system supplied builds. An Opensplice delivery contains particular versions of the sqlite and leveldb libraries on which Opensplice is dependent. These libraries are installed in the Opensplice install directory. However, the versions of these libraries may conflict with newer versions which are available on the system on which Opensplice is installed. Solution: The names of the supplied sqlite and leveldb libraries have been made Opensplice-specific by adding an ospl postfix. |
OSPL-3151-1 |
Receiving unidentifiable duplicate messages during durability
alignment when using DDSI2 When using DDSI2 it was possible that during durability alignment a duplicate message was received which could not be identified as a duplicate because the sequence numbers were different. DDSI2 increments the sequence number for each message it sends; this sequence number is unrelated to the message sequence number, which is not communicated to the receiving node when using DDSI2. Solution: The message sequence number is now communicated to the receiving node. An ADLINK-specific flag is added to the SPDP to indicate that the message sequence number is sent, and the message sequence number is added to all messages transferred using DDSI2. Based on the presence of the ADLINK-specific flag, either the message sequence number or the DDSI2 sequence number is copied into the internal messages. |
OSPL-3151-2 |
Publication/Subscription matched logic incorrect On every non-dispose Publication/Subscription matched message (with a compatible reader/writer) the Publication/Subscription matched count was incremented, and only a dispose Publication/Subscription matched message decremented it. Solution: The Publication/Subscription matched count is now only incremented when the match is noticed for the first time or when QoS settings have become compatible when they were not before. The count is decremented on a dispose Publication/Subscription matched message, or on a Publication/Subscription matched message when QoS settings are no longer compatible. |
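The corrected bookkeeping can be sketched as follows (illustrative Python; the class, method and field names are assumptions, not the internal API):

```python
class MatchedCount:
    """Track the Publication/Subscription matched count per remote
    endpoint: increment only on the first compatible notice, and
    decrement on dispose or when QoS becomes incompatible."""

    def __init__(self):
        self.count = 0
        self.matched = set()  # endpoints currently counted as matched

    def on_message(self, endpoint, compatible, dispose=False):
        if dispose or not compatible:
            # Only decrement if this endpoint was actually counted.
            if endpoint in self.matched:
                self.matched.discard(endpoint)
                self.count -= 1
        elif endpoint not in self.matched:
            # First compatible notice for this endpoint.
            self.matched.add(endpoint)
            self.count += 1
```

With this logic, repeated compatible notices for the same endpoint no longer inflate the count.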
OSPL-3452/ 12411 |
Limiting sample size with DDSI2 Solution: DDSI2 now allows setting an upper limit to the allowed size of the serialised samples, as an added protection mechanism against running into memory limits. The limit is applied both on outgoing and on incoming samples, and any dropped samples are reported in the info log. By default, the limit is 1 byte short of 2 GiB. |
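A configuration along the following lines enables the limit. The element name and placement below are an assumption modelled on the DDSI2 Internal settings of later OpenSplice releases; consult the Deployment Guide of your release for the authoritative name:

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Internal>
      <!-- Reject serialised samples larger than 64 kB (value in
           bytes; assumed element name, check your Deployment
           Guide). Dropped samples are reported in the info log. -->
      <MaxSampleSize>65536</MaxSampleSize>
    </Internal>
  </DDSI2Service>
</OpenSplice>
```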
OSPL-3462 |
idlpp issues compiling in standalone C++ mode Description: idlpp generated uncompilable code from anonymous sequence of sequences of basic IDL types following typedefs of those same basic types. It also produced incorrect definition of anonymous array slice types. Solution: idlpp generates the correct code in these circumstances. |
OSPL-3463 |
Durability using KV persistency may report error while backing up Description: When durability has been configured to use KV persistency and is backing up the persistent store an error may be reported when no data exists yet for a given name-space even though nothing goes wrong. Solution: The error message is not reported any more. |
OSPL-3512/ 12427 |
Potential crash during initial alignment after a dispose_all_data call. Description: The dispose_all_data call creates specific samples that were not compatible with durability alignment. The durability service could not handle these samples, while there are possible scenarios where these samples get stored in a persistent store. The service incorrectly forwarded all initial alignment data to the networking service, which could result in a crash since it could also not handle these samples, which are meant for local delivery only. A crash could also occur if the dispose_all_data sample was the first sample to be received, which could happen because of order reversal during alignment or in combination with the lifespan QoS on the corresponding data. Solution: The durability service was modified to only exchange initial alignment data over the durability partition and not by directly delivering it to a networking service. Order reversal during initial alignment was changed such that samples are ordered first by timestamp, then by writer (instead of the other way around). Support was added for handling the case where the dispose_all_data sample is still the first to be received, i.e. when the lifespan of data samples has expired. |
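The revised ordering rule can be sketched as follows (illustrative Python; the dictionary-based sample representation is a hypothetical stand-in for the internal data structure):

```python
def order_for_alignment(samples):
    # Order first by source timestamp, then by writer id, so that
    # samples from different writers interleave in time order during
    # initial alignment instead of being grouped per writer.
    return sorted(samples, key=lambda s: (s["timestamp"], s["writer"]))
```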
OSPL-3520 |
Opensplice host and target type for Linux OS's has changed from
x86.linux2.6 to x86.linux and x86_64.linux2.6 to x86_64.linux Description: The installation path is affected by this change: the top level directory on a Linux platform, which would have been <ARCH>.linux2.6, is now <ARCH>.linux. Solution: Before - PrismTech/OpenSpliceDDS/V6.3.2/HDE/x86_64.linux2.6/ Now - PrismTech/OpenSpliceDDS/V6.4.0/HDE/x86_64.linux/ |
OSPL-3587 |
Durability exposes and aligns local DDSI2 partitions Description: The DDSI2 service creates some local partitions formatted as '__NODE Solution: The durability service now checks whether a partition has the aforementioned format and refrains from exposing them to other durability services. |
OSPL-3601 |
Extend support for network service data compression and durability datastores. Description: Windows, Enterprise Linux and Solaris distributions now include support for zlib, lzf and snappy compressors in the networking service, and for LevelDB (not windows) and SQLite (not solaris) datastore plugins for durability. Solution: Extra platform support added. |
OSPL-3603 |
Various internal trace messages are reported in ospl-info.log Description: The ospl-info.log shows various messages about internal threads being started and stopped. These messages are neither meaningful nor relevant to users, and they make it harder to find the messages that are important. Solution: These internal messages are no longer reported. |
OSPL-3625/ 12529 |
Spliced could crash when terminating under abnormal circumstances Description: The spliced exit handling must stop all OpenSplice threads accessing shared memory before detaching from the shared memory. However, if processes had been killed using the KILL signal, this did not always happen correctly because spliced would incorrectly assume the shared memory to still be in use. Solution: The termination code now ensures thread termination. |
OSPL-3644 |
Durability service may perform alignment multiple times Description: The durability service aligns samples per partition-topic combination and in some cases could perform this alignment multiple times. Solution: The durability service now actively checks whether it already performed alignment for a given partition-topic combination before initiating the alignment. |
OSPL-3755-1 12573 |
Durability service does not terminate in time. Termination hung because the listeners could not be stopped while listener actions were still active. Listener actions remained active because they were unaware of termination. Solution: Listener actions are now aware of termination and stop accordingly. |
OSPL-3755-2 12573 |
Services are able to outlive the splice daemon When the splice daemon terminated without the use of the ospl tool, it was possible that a service outlived the splice daemon. Solution: The splice daemon now kills all services which remain alive after the service terminate period has elapsed. |
OSPL-3755-3 12573 |
After failure action systemhalt it was possible that shared memory
was not cleaned-up/deleted. When a service failed with failure action systemhalt set, it was possible that shared memory and the key-file were not cleaned up/deleted. This occurred because the splice daemon incorrectly assumed that the dead service had been unable to decrease the kernel attach count upon its termination; the attach count was therefore decreased twice, causing the calls to shared memory cleanup/deletion to fail. Solution: The splice daemon no longer assumes dead services are incapable of performing proper termination; it now only decreases the kernel attach count during termination when all services have terminated. |
OSPL-3761/4002 /12570 |
OSPL_LOGPATH included in host:port check for tcp logging mode Description: Log file names are checked for host:port combinations twice. The second check is done when the path prefix and log file name are concatenated, which leads to incorrect behavior if the value specified in OSPL_LOGPATH contains a colon. Solution: Split prefix and log file name before checking for a host:port combination. |
OSPL-3786 |
Latency spikes on reliable channel Description: On some occasions the latency on a reliable channel would spike periodically (at most once per resolution) due to a mechanism used to limit bandwidth kicking in even when the limit didn't need to be enforced. Solution: The logic has been enhanced to only activate the mechanism when bandwidth needs limiting. |
OSPL-3791/ 12576 |
More strict SAC idlpp sequence support functions creation Description: SAC idlpp creates support functions for sequences (like allocbuf()). These are created when an idl file defines an actual sequence (like sequence Solution: When creating a sequence, check if the sequence is within the same file as the type. If not, check if the type has a keylist related to it. If so, the type is a topic and the sequence support functions have already been created: do not create them a second time. |
OSPL-3851/ 12627 |
DDSI uses more ports beyond those specified in the Deployment Guide Description: The Deployment Guide describes exactly which set of ports is used by DDSI and how this set can be configured. Some versions of OpenSplice (6.3.x except 6.3.0) additionally used two or more (one more than the number of configured channels) kernel-allocated port numbers strictly for transmitting data. Solution: The use of the additional ports has been eliminated and the behaviour is in line with the deployment guide again. |
OSPL-3853 |
Improve the performance of the waitset wait operation. Description: The performance of the waitset wait operation can be improved by evaluating the conditions trigger status within the kernel layer. Solution: Evaluate the trigger status of the conditions attached to the waitset within the kernel layer. |
OSPL-3860 |
Remove unnecessary allocation of a timestamp when updating the
deadline administration. Description: When updating the deadline information of a writer or reader a new timestamp is allocated. By using the timestamps already present in the corresponding sample the extra timestamp allocation can be removed. Solution: Use the timestamps present in the sample when updating the deadline information of the corresponding instance. |
OSPL-3861 |
Improve the performance of the read/take operations by updating
the corresponding administration without extra memory allocations. Description: A read or take operation will update the reader administration. For this update, memory is allocated. The performance of the read or take operation can be improved by removing the extra memory allocation. Solution: Update the reader administration without allocating temporary memory during this update. |
OSPL-3993 |
The Java QosProvider constructor may throw a NullPointerException Description: When a parse error occurs, the Java constructor explicitly throws a NullPointerException. This is not in line with the other APIs and the language mapping. Solution: The QosProvider constructor no longer throws a NullPointerException. Instead, the constructor always succeeds and subsequent invocations on the QosProvider will return DDS.RETCODE_PRECONDITION_NOT_MET. The API furthermore has more thorough error checking within JNI: if an exception occurs, DDS.RETCODE_ERROR is returned instead of an exception being thrown. |
OSPL-3997 |
Networking defragmentation buffers refcount issue Description: Static analysis of RTnetworking code revealed a potential issue with the administration of the defragmentation buffers. An atomically modified counter was accessed without atomic access, allowing a potential race-condition. Solution: The counter is correctly accessed now. |
OSPL-4007 /12660 |
Workaround for issue when using Jamaica VM Description: When OpenSplice is used with JamaicaVM, JamaicaVM crashes due to a difference in how a JNI call to NewStringUTF is handled. Solution: A workaround is implemented in OpenSplice to ensure that JamaicaVM no longer crashes. |
Report ID. | Description |
---|---|
OSPL-3887 |
SAC QoS-provider doesn't return DDS_RETCODE_NO_DATA When a QoS cannot be found, the SAC QoS-provider returns DDS_RETCODE_ERROR instead of DDS_RETCODE_NO_DATA. Solution: The API has been changed to return DDS_RETCODE_NO_DATA when a QoS cannot be found. |
Report ID. | Description |
---|---|
OSPL-3732/ 12565 |
DCPSPublication built-in Topic published unnecessarily Even when publication of built-in topics is disabled through the configuration file, the DCPSPublication instance that corresponds to a DataWriter is still write-disposed and unregistered as volatile data when the DataWriter is deleted, to allow DataReaders to clean up the resources associated with that DataWriter. However, if no instances are registered by a DataWriter at the time it is deleted, there is no need to publish it. Solution: If built-in topics are disabled through configuration and no instances are registered by a DataWriter when it is deleted, the write-dispose and unregister of the DCPSPublication are no longer performed. |
Report ID. | Description |
---|---|
OSPL-2452 |
Sample loss using DDSI2 and readers with resource limits Description: DDSI2 could silently drop samples destined for readers at their resource limit, rather than blocking the writer and/or notifying the application or writing messages in the log. Solution: DDSI2 now blocks the data stream until the reader accepts the data, but for an unresponsive reader, it will eventually drop the data anyway. When this happens, a message to that effect will be logged. |
OSPL-2823/3626/ 12526 |
Sample lost administration memory leaks and unlocked modifications Description: The internal administration related to the sample lost mechanism would leak some memory each time it was accessed. Also it could be accessed by multiple threads at the same time without any locking, which could potentially lead to undefined behaviour. Solution: The issues were resolved by freeing the administration when required to prevent memory leaks, and only allowing access while a lock is taken. |
OSPL-2960/ 12243 |
Improved Durability alignment through DDSI2 Description: The Durability Service assumes that all writer/reader connections are available when one is discovered. DDSI2, however, discovers them one by one. This means that Durability sometimes sends data which DDSI2 will drop (because there is no related connection yet), which can cause Durability to never complete alignment. Also, DDSI2 had a problem that would sometimes increase the discovery time dramatically, which would trigger the Durability alignment issue. Solution: The Durability service now creates all readers before enabling the listeners connected to them, meaning that required writer/reader connections are discovered before acting on incoming data. The DDSI2 service increases the writer heartbeat rate when a new reader is detected, making writer/reader connection discovery more robust and quicker. |
OSPL-3186 |
DDSI2 could use freed memory in its reordering of incoming samples Description: The sample re-ordering mechanism internal to DDSI2 could use freed memory under very particular circumstances, requiring the number of buffered samples to have reached its maximum size as well as a particular order of arrival of out-of-sequence samples at the high end of the sequence number range. This is unlikely to occur in normal circumstances, using networks of "normal" reliability (such as Ethernet) and with receivers that generally keep up with the data flow. Solution: The algorithm has been adjusted to avoid referencing freed memory. |
OSPL-3187 |
DDSI2 fails to work with small WHC water mark settings Description: DDSI2 would completely fail to work when the WHC water marks were set very low. More precisely, if DDSI2 would block because the number of outstanding unacknowledged messages (N_UNACK) exceeded the high water mark, and its dynamic message packer had buffered so many samples that the readers would not yet be able to acknowledge enough samples to drop N_UNACK below the low water mark, communication would stop. This also affects endpoint discovery. One of the consequences of the inability to operate with low water mark settings meant that it was impossible to operate DDSI2 in the safest configuration, where each sample must be acknowledged before sending the next, thereby potentially causing many more retransmissions. This is especially problematic if the network is unreliable, or the receivers are unable to keep up with the influx of data. Solution: The samples are now always flushed to the network before blocking. |
OSPL-3470/ 12422 |
Compatible typeSupport is rejected without cause Description: When a type was being registered that conflicted with a type that was registered earlier (for example because they had a conflicting keyList), then a PRECONDITION_NOT_MET was being returned, but no descriptive message was included as to the root cause. Solution: A message is now written into the error log and also accessible through the ErrorInfo interface. |
OSPL-3509 |
Minimum networking ThrottleLimit is based on networking FragmentSize Description: When the fragmentsize > throttlelimit, a scenario with high load can cause the throttlevalue to go below the fragmentsize. At that point networking is unable to send any messages. Solution: The minimum value of the "OpenSplice/NetworkService/Channels/Channel/Sending/ThrottleLimit" config element is now based on the "OpenSplice/NetworkService/Channels/Channel/FragmentSize" value. |
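Using the configuration paths named above, a consistent configuration keeps the ThrottleLimit at or above the FragmentSize, for example (values and attribute names are illustrative):

```xml
<OpenSplice>
  <NetworkService name="networking">
    <Channels>
      <Channel name="reliable" reliable="true" enabled="true">
        <!-- ThrottleLimit may never throttle below FragmentSize,
             otherwise the channel cannot send any message at all. -->
        <FragmentSize>8192</FragmentSize>
        <Sending>
          <ThrottleLimit>10240</ThrottleLimit>
        </Sending>
      </Channel>
    </Channels>
  </NetworkService>
</OpenSplice>
```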
OSPL-3528/ 124264 |
Excessive Lease manager logging Description: In some cases where a writer was using the autounregister_instance_delay, the info log file would be flooded with messages stating that the leaseManager did not wake up in time. Solution: Logging has been cleaned up. |
OSPL-3596 |
RnR storage has problem replaying types containing arrays The RnR storage did not correctly handle types which contain bounded arrays or sequences. When the array or sequence contained references, the corresponding reference counts were not properly handled. Solution: The algorithm now correctly updates the reference counts when objects are copied. This includes references contained in bounded arrays and sequences. |
OSPL-3636/ 12534 |
Communication does not restart after ethernet cable unplug/replug Description: When the network cable is unplugged and then replugged, reliable communication is not restored when reconnect is enabled and discovery is disabled. Solution: When discovery is disabled, the send component of a channel detects the network communication problem and the recovery of the network communication. This status change is now also communicated to the receive component of the channel, which resets its status in case a reconnect occurs. |
Report ID. | Description |
---|---|
OSPL-3281/ 12378 |
To monitor the behaviour of the networking service and the durability service, additional statistics of the internal queues have been added. Description: The networking service maintains internal queues to provide reliability and fragmentation/defragmentation. The durability service maintains a queue for persistent data that has to be stored on disk. To monitor the behaviour of these queues, extra statistics have been added which are available through the C&M API. Solution: For the networking service, additional statistic counters are available on the queues used by the resend administration (ResendQueue), the reorder administration (ReorderQueue) and the network queue between the writers and the networking service. For the durability service, statistic counters are now available on the persistent data queue (GroupQueue). |
OSPL-3579 |
Durability service may crash when aligning historical data Description: In some cases, while aligning historical data with another durability service, a variable in an internal algorithm may not be initialised but is used and freed later on anyway. This causes undefined behaviour, but mostly leads to a crashing durability service. Solution: The internal algorithm has been modified to initialise the aforementioned variable in all cases. |
Report ID. | Description |
---|---|
OSPL-2362/ 11911 |
Ownership strength changes in DataWriter's QoS go unnoticed Decreasing or increasing the ownership strength would have no effect on the ownership strength registered with instances on the DataReader. Solution: Ownership strength is now taken into account during the evaluation of QoS updates. |
OSPL-2480/ 11804 |
When an idl file contains "too many" nested modules idlpp crashes
when Java code is generated from the idl files. If an IDL file contains "too many" nested modules, the IDL preprocessor idlpp crashes when Java is generated. This is because in Java nested modules lead to the generation of path names containing the names of the modules: the deeper the nesting, the longer the path names become. Because a fixed-size container for path names was used, an overflow could corrupt memory, which could result in a crash. Solution: By dynamically allocating the path names there is always enough room available to hold the complete path of a module, thus preventing an overflow. |
OSPL-2482 |
Removed possible deadlock in d_storeGroupStoreXML if result is
D_STORE_RESULT_PRECONDITION_NOT_MET or D_STORE_RESULT_ILL_PARAM The lock on the persistent XML store was not released in case of a D_STORE_RESULT_PRECONDITION_NOT_MET or D_STORE_RESULT_ILL_PARAM result, which would have caused a deadlock. Solution: The lock is now always released in all cases. |
OSPL-2661 |
Crash when using reliable-under-publisher-crash (RUPC) functionality When a publisher node crashes and RUPC is enabled then it appears that a node that has not received all messages from the crashing publisher is not updated by the other nodes. The problem occurs because the announce messages that the RUPC function uses are sent to the wrong network addresses. Solution: The announce messages used by the RUPC protocol have to be sent to all addresses associated with the global partition. |
OSPL-3084/ 12328 |
C++ copyIn/copyOut code generated from idl containing sequences
of anonymous sequences failed idlpp generated invalid code for sequences of anonymous sequences. The copyIn/copyOut routines generated did not recurse into sequences and kept overwriting the base sequence resulting in memory corruption. Solution: Using the loop index to indicate sequence depth during code generation produces copyIn/copyOut routines that recurse into sequences. |
OSPL-3085/ 12326 |
Cast warning in generated idlpp c++ code due to cast from const
pointer into non const pointer The generated c++ code from idlpp contains a copy function for arrays (if present) which cast away a const pointer into a non const pointer. This will cause a warning with strict compiler warnings set. Solution: Generated copy code in c++ for arrays contains now also a const pointer for the from pointer. |
OSPL-3222/ 12371 |
Available traces for the throttling mechanism enhanced to be less performance
intrusive The trace level needed to obtain information on the throttling mechanism could be performance intrusive. Solution: A new tracing category "Throttling" has been introduced for both RTNetworking and secure networking, which allows throttling traces to be enabled separately from the other categories. On level 1, throttling traces are only emitted on change. |
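As an illustration only, enabling the new category might look like this in an RTNetworking configuration; the exact element names and nesting under Tracing are assumptions based on the description above, so the deployment guide's schema should be consulted:

```xml
<NetworkService name="networking">
  <Tracing enabled="true">
    <Categories>
      <!-- Level 1: throttling traces emitted on change only (assumed) -->
      <Throttling>1</Throttling>
    </Categories>
  </Tracing>
</NetworkService>
```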
OSPL-3280/ 12379 |
The domain service heartbeat properties should be made configurable separately. Previously, the properties of the domain service heartbeat were controlled by the settings of the lease element. This coupling has been removed because it is desirable to specify the heartbeat frequency independently. Furthermore, it should be possible to specify the heartbeat transport priority and the scheduling parameters of the heartbeat sending thread. Solution: Added a Heartbeat configuration item to the Daemon element of the Domain service configuration, which allows specifying the frequency (expiry time and update factor) of the heartbeat, the transport priority QoS setting of the heartbeat writer, and the scheduling parameters of the heartbeat sending thread. |
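A hedged sketch of what such a configuration could look like; the element and attribute spellings (Heartbeat, transport_priority, ExpiryTime, update_factor, Scheduling) are assumptions derived from the description above and should be checked against the deployment guide:

```xml
<OpenSplice>
  <Domain>
    <Name>ExampleDomain</Name>
    <Daemon>
      <!-- transport priority of the heartbeat writer (assumed attribute) -->
      <Heartbeat transport_priority="10">
        <!-- expiry time in seconds; update_factor controls how often
             the heartbeat is renewed within that period -->
        <ExpiryTime update_factor="0.2">10.0</ExpiryTime>
        <!-- scheduling parameters of the heartbeat sending thread -->
        <Scheduling>
          <Class>Default</Class>
          <Priority>0</Priority>
        </Scheduling>
      </Heartbeat>
    </Daemon>
  </Domain>
</OpenSplice>
```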
OSPL-3284/ 12382 |
Manual start of networking doesn't allow communication. A kernel group write is only allowed if the number of registered services is equal to the number found in the configuration. This is done to ensure all networking services are up and running before actually performing group writes. Writes performed before they are up and running are repeated later with a resend, but when the networking service is started manually at a later point, communication does not occur. Solution: The number of registered services must be equal to or greater than the number found in the configuration to perform a write. A networking service that is started manually at a later point will now join the system. |
OSPL-3355/ 12382 |
When an aligner for a namespace appears and durability
services exist that do not have an aligner for this namespace then no
conflict is detected and no merge action is triggered, although it should. In the scenario where a durability service has its 'aligner="false"' -property set for a namespace and its last available aligner leaves, the durability service will notice that no aligner is available anymore. When an aligner (re)appears for the namespace the durability service should perform the merge action as specified in the configuration, because the aligner may have "injected" new data in the system. Solution: When the last aligner for a namespace has left, the state for the namespace and the role will be cleared. When a new aligner arrives its initial state for the namespace is set to zero. Because the cleared state differs from the initial state a namespace state conflict is detected as soon as an aligner appears and a merge action is performed (when specified in the configuration). |
OSPL-3466/ 12420 |
Report plugin configuration not applied to applications In certain scenarios, a user-defined report plugin was only used by OpenSplice services and not by applications using OpenSplice. Solution: The issue has been fixed; the report-plugin configuration is now also used by applications. |
Report ID. | Description |
---|---|
OSPL-350/ bugzilla-44 |
Segmentation fault when writing or registering a sample with a null
member in Java saj_cfoiStruct did not check if structObject is NULL, which would result in a segmentation fault when writing or registering a sample with a null member in Java Solution: The saj_cfoiStruct function now checks if structObject is NULL. |
OSPL-492/ bugzilla-49 |
Invalid conversion of multidimensional arrays from IDL to C# Previously, invalid code was generated for multidimensional arrays, sequences of arrays, and operations on them in C#. Solution: The affected functions have been updated and the use of snprintf has been removed. |
OSPL-2243/ 11714 |
Resend of unregister message may cause a crash. When an unregister message is rejected by the RT networking service then the resend of the unregister message may cause a crash because the corresponding instance administration is already freed. Solution: The RT networking service should not reject an unregister message which is consistent with the behaviour of the other readers. |
OSPL-2799 |
Maximum termination wait time is displayed incorrectly when doing ospl stop When terminating OpenSplice with ospl stop a message appears on the command line with the maximum waiting time. This message was not in line with the possible configured ServiceTerminatePeriod. Solution: The maximum termination time will now be displayed correctly. |
OSPL-2789 |
Compression may fail on Linux when using a large FragmentSize in
RTNetworking When a large FragmentSize was configured in RTNetworking, the service might run out of stack on Linux platforms. Solution: Memory needed for handling the (de-)compression of packets is now (pre-)allocated on heap. |
OSPL-2865/ 12165 |
RT Networking dies when the status of an unused network interface
changes to down. The RT networking service is reported dead when the status of an unused network interface changes, for example when it is configured down. This network state change triggers an event within the RT networking service which was not handled correctly, causing the networking service to stop updating its lease. Solution: The function that checks the network interface status now correctly handles the network interface status events. |
OSPL-3023.1 |
Could not add priority_kind to Domain/GeneralWatchdog/Scheduling/Priority in osplconf osplconf did not know about priority_kind for Domain/GeneralWatchdog/Scheduling/Priority. Solution: osplconf metadata is updated. |
OSPL-3023.2 |
OpenSplice_DeploymentGuide.pdf referred to old style configuration paths The deployment guide instructed the user to use Domain/Daemon/GeneralWatchdog, whereas the user should use Domain/GeneralWatchdog. Solution: The documentation has been corrected. |
OSPL-3023.3 |
DDSI2 and DDSI2E services did not recognize the Scheduling element
and the priority_kind attribute The parsers of both services had no notion of the Scheduling element, while the common code in the user layer requires it in order to work. Solution: The parser code has been updated and the osplconf metadata file corrected. |
OSPL-3053/ 12324 |
Crash on deletion of DataWriter. In some cases where a Reader or a Writer with an activated deadline or auto_unregister policy was being destroyed, the leaseManager would still try to notify about missed deadlines or actively send unregister messages to an already partially deleted Reader or Writer. This rarely occurred, but it is clearly unwanted and might crash or corrupt the system. Solution: By now stopping the deadline and auto_unregister algorithms before the actual destruction of the Reader or Writer, this particular sequence of events should no longer occur. |
OSPL-3209/ 12368 |
Durability may fail to align non-volatile data when configuring
delayed alignment for one or more name-spaces With the delayed alignment enabled, durability accepts the introduction of a new persistent data-set in the system even after the start-up phase in case no data has been re-published from permanent storage by any durability service and no application has published any sample so far either. When a new data-set is detected in the operational phase and delayed alignment is required, the durability service marked all partition-topic combinations as incomplete instead of only marking the ones that match the name-space for which delayed alignment is required. As a result the partition-topic combinations that do not belong to the name-space will never be marked as complete again. In case another durability service wants to align from this durability service after that, it concludes that the set of data over there is incomplete, where it actually is complete. This leads the alignment process on the newly joining node to fail. Solution: The algorithm that marks the groups as incomplete (when detecting that delayed alignment needs to take place for a given name-space), has been modified to only mark those partition-topic combinations that match the name-space. |
OSPL-3224 |
Durability KV store may access memory that is already freed. When cleaning up the instance administration it may occur that an already freed reference is accessed. Solution: Set the freed reference to NULL and check if it is not NULL when accessing it. |
OSPL-3254 |
Failed termination messages on Windows when closing OpenSplice When terminating OpenSplice with ospl stop on Windows, a number of messages with the text "Failed to send the SIGTERM signal to the splice daemon process 0" could appear in the info log file. These messages appeared for each service configured in the configuration file; in fact, the services were terminated correctly. Solution: These messages are no longer reported when the services terminate correctly. |
OSPL-3290 |
When rebuilding custom_libs with Visual Studio the dll files in
$OSPL_HOME/bin are not replaced. When rebuilding custom_libs with Visual Studio the dll files in bin were not replaced. The dll files were produced to the $OSPL_HOME/lib directory and required manually copying into the $OSPL_HOME/bin directory. Solution: When rebuilding custom_libs with Visual Studio the dll files in $OSPL_HOME/bin are replaced. |
OSPL-3342 |
Potential uninitialized memory access in the Java language-binding In the Java language binding there were (internal) error paths in which uninitialized values could be returned. Solution: The return values have been initialized for the error case as well. |
OSPL-3351 |
RT Networking : Parallel demarshalling administration may leak Due to an error in a method used by the cleanup routines to access the parallel demarshalling administration for Java, the related administration may leak if the number of threads is changed or the application stops. Solution: The error in the cleanup routines is resolved ensuring that the right pointer is returned. |
Report ID. | Description |
---|---|
OSPL-2283/ 11750 |
Memory leak in lookup_participant The domain identifier was not freed. Solution: The domain identifier is now freed before leaving gapi_domainParticipantFactory_lookup_participant. |
OSPL-3158/ 12351 |
Memory leak after deleting a waitset The common destructor in GAPI did not free object if object was of type waitset. Solution: Memory leak fixed. |
OSPL-3188 |
Error in throughput measurement of streams example The streams example measures throughput by taking the total amount of samples divided by the amount of time in one run. It consists of a reader and a writer process which are started independently. Previously the reader would start the time-measurement when the process started, even though the writer process was not yet running. This resulted in throughput that was too low, because more time was spent measuring than was spent sending data. Solution: The example now starts the measurement when it receives the first sample. |
OSPL-3198 |
Durability could crash during startup when doing initial merge Due to a timing dependency on whether the role of a fellow was set the durability service sometimes assumed that the role was available when it wasn't, resulting in trying to read a NULL-pointer. Additionally, the service would use a fast spinning loop to determine whether communication with another fellow was approved and would loop forever if the fellow would not be approved. Solution: The role is now only accessed when it is available. Furthermore the spinning loop that determines whether communication with another service is possible is made less cpu-hungry (by introducing a sleep) and the loop skips a service for which the communication state is not approved instead of looping forever. |
OSPL-3200 |
Consistent final value not always guaranteed with BY_SOURCE_TIMESTAMP Where a single writer updates an instance with the same timestamp, a consistent final value for that instance was not guaranteed across all subscribers. Also when a time with all zeroes was supplied, the actual time would be used instead of the supplied time. Solution: When updating the administration of the readers the consistent final value is guaranteed by incorporating a writer-generated sequence number and time zero has no special meaning anymore. |
OSPL-3218 |
Streams API returns old sample multiple times After the streams get_w_filter API call returned NO_DATA, administration in the StreamReader caused a consecutive get_w_filter to return the last received sample. This pattern (NO_DATA, last sample) repeated itself until new data was received. Solution: The administration is now left in a correct state after NO_DATA is returned. |
Report ID. | Description |
---|---|
OSPL-3096 |
In single-process deployment CTRL-C doesn't work Due to a signal-handler being overruled on POSIX systems, termination requests like CTRL-C (e.g., the signals SIGINT, SIGQUIT, SIGTERM, SIGHUP and SIGPIPE) would not result in an immediate stop of the application. Solution: The signal-handler for single-process deployments is no longer overruled by handlers that don't stop the application. |
Report ID. | Description |
---|---|
OSPL-10 4508 |
TypeSupport with invalid type name causes crash during register_type When a type support object is created with a type name which is not known in the meta database, the register_type function crashes. Solution: A code change was made to prevent the crash and the documentation was updated to improve the description of register_type. |
OSPL-1430 7255 |
Durability Service behaviour with no aligner Previously, when no aligner was available, a durability service that cannot act as aligner itself would wait until an aligner became available; if none ever appeared, it would wait forever. In some situations it may be desirable to exit instead. This behaviour has been made configurable using the TimeToWaitForAligner option. Currently two values are supported: 0.0 (exit if no aligner is available) and 1.0 (wait until an aligner becomes available). The default is 1.0, which matches the original behaviour. When the durability service exits, error code 1 (recoverable error) is returned. Solution: The behaviour of the durability service when no aligner is present is now configurable. |
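As a sketch, the option might be set in the durability service configuration as follows; the exact placement of the TimeToWaitForAligner element inside the DurabilityService element is an assumption, so the deployment guide's schema should be consulted:

```xml
<DurabilityService name="durability">
  <!-- 0.0: exit (error code 1) when no aligner is available;
       1.0: wait indefinitely for an aligner (default) -->
  <TimeToWaitForAligner>0.0</TimeToWaitForAligner>
</DurabilityService>
```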
OSPL-1431 10964 |
idlpp multiple prefix support for Java idlpp did not support prefixing multiple modules for the java language binding using the -j option. Solution: idlpp now supports prefixing multiple modules. |
OSPL-2307-1 11751 |
Recursively resolved header files all present in top level header file. idlpp would include all generated header files for C++ for which it found an idl preprocessor directive. Solution: idlpp generated C++ code does not include all recursively resolved header files anymore and instead only references the top level include file. |
OSPL-2307-2 11751 |
idlpp sometimes forgets top level line markers in preprocessor output idlpp would forget to include top level line markers in preprocessor output if the included file contained a idl preprocessor include directive itself and no actual idl code was declared before the include directive. Solution: idlpp preprocessor now prints a line marker for the file it's in, as soon as it finds a preprocessor include directive and the line marker wasn't printed yet. |
OSPL-2696 11998 |
REPLACE and DELETE merge policy Solution: With the REPLACE merge policy it is possible to dispose and delete historical data on a node, and replace it with the transient and persistent data from another node. Immediately after successful completion of the REPLACE merge action the replacement data will be available to late joining readers, the data in the reader queue of existing readers will be disposed and replaced with the replacement data, and the generation count of the replacement data is increased. With the DELETE merge policy it is possible to dispose and delete historical data on a node. Immediately after successful completion of the DELETE merge action the historical data in the reader queue of existing readers will be disposed and is not available any more to late joining readers. |
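As an illustration of where such a merge policy would be selected, a hedged durability configuration sketch follows; the Policy/Merge element names, attribute spellings and the role name "RoleB" are assumptions based on the description above, not verified against the schema:

```xml
<DurabilityService name="durability">
  <NameSpaces>
    <NameSpace name="defaultNamespace">
      <Partition>*</Partition>
    </NameSpace>
    <Policy nameSpace="defaultNamespace" durability="Durable"
            alignee="Initial" aligner="true">
      <!-- type "Replace": dispose the local historical data and replace it
           with the remote set; type "Delete": only dispose and delete the
           local data. scope names the role the policy applies to (assumed). -->
      <Merge type="Replace" scope="RoleB"/>
    </Policy>
  </NameSpaces>
</DurabilityService>
```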
OSPL-2705 12008 |
Java deadlock during shutdown Newly spawned thread in shutdown hook to delete contained entities causes a deadlock. Solution: In OSPL v6 a user should not request an exit from within a listener thread (until the final solution has been implemented - expected in v7). A solution is to spawn a thread that calls System.exit() instead of calling the method from within the Listener callback itself. OSPL now tries to detect this deadlock, reports an error and calls system.halt. |
OSPL-2762 12045 |
Deadlock on receiving SIGSEGV in Java language binding The signal handler used to install the default signal handler and re-raise SIGSEGV. This caused JVM to pass a SIGABRT to the signal handler thread itself which would try to notify itself, and as a result end up in a deadlock. Solution: Upon receiving a SIGSEGV (synchronous signal) asynchronously the signal handler thread does not change the signal mask and now invokes kill instead of raise to avoid possible deadlocks. |
OSPL-2844 12163 |
Alignment of transient data can cause a crash after several restarts
on a second node where the first one keeps running Node A is started and publishes transient data. Node B is started and aligns with the first node. The result is stored in a persistent file. After several restarts of node B, the internal hash table in the durability service gets out of sync, caused by the combination of data from the persistent file and data received from node A, and causes a crash. Solution: The internal hash table in the durability service is now kept consistent with the actual data. |
OSPL-2889 12174 |
Writing data with empty strings in Java may not result in correct data being received in DDS Due to an issue with the routines used by the Java language-binding to do empty-string interning, non-empty strings could show up in the data received in DDS. Solution: The empty-string interning mechanism has been fixed to prevent this issue. |
OSPL-2891 12167 |
Data reader statistic "numberOfSamplesTaken" not being updated The "numberOfSamplesTaken" statistic was not being updated correctly in the Tuner. Solution: A fix has been applied and the statistic is now updated correctly in the Tuner. |
OSPL-2894 |
Logger example failed to build on Solaris The logger example failed to build on Solaris due to makefile issues. Solution: The makefile has been corrected. |
OSPL-2900 12235 |
Instances may be corrupted when using small reader_data_lifecycle.autopurge_*_delay When a durability_service.service_cleanup_delay > 0.0 is used for a Topic, in combination with a very small value (or zero) for reader_data_lifecycle.autopurge_*_delay on a DataReader for the same Topic, instances may become corrupted in that DataReader during the delivery of historical data. This occurs for instances that had already been disposed prior to delivery, for which no live writers exist anymore, and whose service_cleanup_delay has not yet expired. Solution: The internal algorithm that deals with the above QoS policies has been altered and no longer causes memory corruption. |
OSPL-2904 12236 |
OSPL_LOGAPPEND does not work as expected The environment variable OSPL_LOGAPPEND did not work as expected: setting it to "true" had the same effect as "false". Only if the variable was not defined at all did you get the append behaviour. Solution: The evaluation of OSPL_LOGAPPEND has been corrected. |
OSPL-2919 12244 |
Opensplice DDS services always wait worst-case terminate period
in case of a service crash When using Opensplice DDS and for some reason a service crashes or is deliberately crashed, the Opensplice DDS services always wait the complete ServiceTerminatePeriod to exit. Solution: This extra delay in the termination of services has now been resolved and the services will now close as soon as possible. |
OSPL-2972 |
Invalid configuration values on Windows 64-bit On Windows 64-bit platforms, some configuration values were parsed incorrectly, leading to unexpected behaviour. For example the MaxBurstSize of the RTNetworking service didn't seem to work on Windows 64-bit platforms. Solution: The parsing and printing of 64-bit values has been fixed for Windows 64-bit platforms. |
OSPL-2977 |
Appending to a stream may return RETCODE_TIMEOUT if flush limit is reached Users of the Streams API should take into account that the append call may return RETCODE_TIMEOUT, as a result of an implicit flush that hits the writer resource limits. Solution: To handle this situation the following snippet of code is suggested:
|
OSPL-2979 |
Out of range user-friendly configuration values might result in strange values being used. When an out-of-range user-friendly size expression with the suffixes k/K/m/M was used, the reported replacement value wasn't actually used. Solution: When an out-of-range value is specified, the reported replacement value is actually used. |
OSPL-3036 12173 |
Applications using DCPS Java API may crash when creating a
participant in case stack-traces are not available. The DCPS Java API is trying to resolve the class name of the main-class as name of the participant. The algorithm that determines this information was not robust against stack traces not being available causing an ArrayIndexOutOfBoundsException. Solution: Made algorithm robust against not being able to inspect the stack. In this case a warning is issued in the logs and an alternative name is chosen based on process id. |
Report ID. | Description |
---|---|
OSPL-707/2546/2590/2735 10333/11875/11882/12037 |
Crash in memory manager In case of concurrent allocating and freeing small blocks distributed over the memory in a particular way, a free operation could shrink the region of used memory by multiple blocks of memory where it should have shrunk by only one. This in turn could lead to the shared memory allocator to allocate a block of memory twice, most of the time leading to a crash in the allocator itself. Solution: The condition for shrinking the region has been fixed such that this can no longer occur. There is no effect on memory usage. |
OSPL-2246/ 11720 |
Topic content filter doesn't work with no-writers or disposed events Data is not filtered as expected when using a content filtered topic and the events are disposed or the writer is deleted. Solution: Changes have been made to the content filtered topic code so that the instance part of the filter is now always evaluated, but the data part of the filter is only evaluated when a valid sample is written. |
OSPL-2303/ 11761 |
OpenSplice DDS services always wait worst-case terminate period in case of a service crash. When using Opensplice DDS and a service crashes or is deliberately crashed, the Opensplice DDS services always wait the complete ServiceTerminatePeriod to exit. Solution: This extra delay in the termination of services has now been fixed and the services will now close as soon as possible. |
OSPL-2535 |
New reader statistic numberOfSamplesLost Solution: The new reader statistic numberOfSamplesLost counts the number of samples that have been lost for that reader during network transportation. |
OSPL-2631 |
wait_for_historical_data_w_condition accepted DURATION convenience types as Time input. The reference manuals advocated the use of DDS_DURATION_ZERO and DDS_DURATION_INFINITE as valid input arguments for the Time_t min_source_timestamp and max_source_timestamp parameters. This is confusing because timestamps are expected, yet durations are considered valid input. Furthermore, supplying DDS_DURATION_INFINITE leads to an uninitialized variable, which could potentially lead to non-deterministic behavior. Solution: The reference manuals have been updated, so it is now clearer what parameters are expected. The uninitialized variable bug has also been fixed. |
OSPL-2720 |
Configurator Tool does not handle lower case values for sizes Values such as 32k were not accepted by the configurator, although they are valid for OpenSplice. Solution: Size values with the suffixes k, m and g are now valid in the configurator. |
OSPL-2754/ 12042 |
ospl tool updated to ensure orphaned key files on Unix are cleaned On Windows, the ospl tool ensures that any orphaned key files (i.e. where there are no running processes present that match the key file) are deleted before starting OpenSplice. Solution: This Windows functionality has been extended to Linux; additionally, only orphaned files that belong to the user starting OpenSplice are cleaned up. |
OSPL-2781 |
Configurator doesn't recognize throttle_limit as a number The Configurator did not allow the use of k, m or g when setting its value. Solution: Configurator fixed. |
OSPL-2787/ 12056 |
DCPS C++ CORBA co-habitation custom library rebuild failed to re-build The DCPS C++ CORBA co-habitation custom library failed to re-build because of a missing macro definition in the supplied makefile. Solution: The missing macro definition has been added to the makefile. |
OSPL-2790/ 12047 |
Missing PID when displaying Java participants in the tuner When using the Tuner in the participant view and the participant application is a Java application which is not a DDS service like the Tuner or the Tester, only the name of the class is displayed, not the PID. When several instances of the same application are running on the same node, it is not possible to distinguish one from another. Solution: The Java participant applications now also show a PID behind their name to make it easier to distinguish them. The participant naming is now also consistent with the C, C++ and C# applications. |
OSPL-2804/ 12123 |
Alignment of transient data failed in multiple writer-scenario In a scenario with two writers writing the same instance where the 2nd writer is deleted, alignment of transient data failed because the registration for the first writer was not aligned while the unregistration of the 2nd writer was. Solution: For each registration that has no samples anymore in the store, durability sends an extra registration for that writer. |
OSPL-2816/ 12145 |
Deadlock involving groups using synchronous write Incorrect initialisation of a lock involved in processing acknowledgements on synchronous writes could cause a deadlock when different processes were trying to access the data structure. At the point where this occurs, a lock on the group with which that synchronous write is associated may be held, which could cause hanging processes in various configurations and for seemingly unrelated reasons. Solution: The initialisation of the lock has been corrected. |
Report ID. | Description |
---|---|
OSPL-2763 |
Shared memory leaks on deletion of DataReader. Under certain conditions memory would leak when a DataReader was deleted. Solution: The leak has been fixed. |
Report ID. | Description |
---|---|
TSTTOOL-123 |
Quoted partition name in Tester script is interpreted as part of actual partition name. If a user wanted to execute a reader command in a Tester script that specified a partition with dots or stars in its name, the only way for compilation of the script to succeed was to surround the partition name in quotes. However, the quotes were then interpreted as part of the partition name when the reader was created. Solution: In a reader command, if the partition name is wrapped in quotes, the quotes are dropped after compilation. If a user still wants to include quotes as part of the actual partition name, they can escape the quotation marks in the partition name in the script. |
Report ID. | Description |
---|---|
OSPL-2733 |
Issue in Streams API could result in crash on Windows. A problem was found in the Streams API with the initialization of mutexes and condition variables, which could result in data not being published and/or undefined behaviour of applications built on top of the Streams API on Windows. Solution: The incorrect initialization was fixed. |
Report ID. | Description |
---|---|
OSPL-2729 |
Problems starting OpenSplice when OSPL_URI is quoted. OpenSplice DDS will not start with an OSPL_URI surrounded by quotes (" "). Solution: OSPL Tool now handles a quoted OSPL_URI. |
Report ID. | Description |
---|---|
OSPL-614/OSPL-1843/ 10335 |
New durability persistence store implementation using a key-value storage To store durable data the durability persistency store implementation should be extended with a robust and fast storage mechanism which will replace the MMF store implementation which is known to suffer from robustness issues when a crash occurs. Solution: The new persistency store implementation is based on a key-value storage which makes use of third-party products like Sqlite or Leveldb to store the durable data on disk. |
OSPL-1430/ 7255 |
Durability crashes when starting a new instance of ospl if there is no aligner A fix was applied for this issue in 6.3.0p5. This fix has been removed in 6.3.1 and a new fix will be applied in a subsequent release. Solution: N/A |
OSPL-2131/ 11470 |
Improvement in read performance of certain complex Java CORBA DDS data structures Solution: An option has been added to improve the read performance of certain complex Java CORBA DDS data structures by limiting the overhead caused by JNI invocations during data passing from C to Java. |
OSPL-2244 |
IPv6 interface detection in Windows causes a crash If an application in Windows used the networking service and tried to detect available IPv6 interfaces a system crash occurred. This made IPv6 on Windows unusable. Solution: The implementation of the IPv6 interface detection is improved so that it no longer causes crashes. |
OSPL-2474 |
Default RMI ServiceDiscoveryTimeout needs to be increased. Due to some changes in the default timing for the alignment of historical data, alignment takes a bit longer to start by default. This sometimes causes RMI to fail to locate services, as the default time-out of RMI no longer matches the default durability configuration, even though the discovery period of RMI can be influenced by using the --RMIServiceDiscoveryTimeout= option. Solution: The default time-out of RMI service discovery has been increased from 10 to 30 seconds. |
OSPL-2504/ 11790 |
Issues with recording and replaying particular data types and content Serialization/deserialization used by the XML-storage component of the Record and Replay service causes issues if particular character data is present in DDS samples which are recorded and/or replayed. Specifically, newline characters and unbounded character sequences containing illegal XML content are not supported. Solution: Relevant parts of the product were changed and the limitations have been lifted. |
OSPL-2660 |
The durability service may crash when using dynamic name-spaces. When a new durability service joins the domain, it may introduce a new namespace in the domain that did not exist on the existing nodes in the domain. In this situation an already running durability service may crash when it has a matching durability policy configured for the new name-space (so when using the dynamic name-space feature). In this situation the existing durability service dynamically registers the new name-space as well and applies the configured policies to data that matches that name-space. One of the internal algorithms assumed that all name-spaces for a given durability service are fixed after start-up. Solution: The internal algorithm in the durability service has been changed to be robust against a changing set of name-spaces. |
OSPL-2676/ 11993 |
Deployment guide on OSPL behaviour for signals incorrect. The documented behaviour for SIGINT in the deployment guide was incorrect. Additionally, the SIGQUIT behaviour was incorrect in the codebase. Solution: Deployment guide and codebase corrected for signals. |
Report ID. | Description |
---|---|
OSPL-1430/ 7255 |
Durability crashes when starting a new instance of ospl if there is no aligner Durability did not check whether there was an aligner when DDS is started. If there is no aligner and a new node starts, the two nodes will reach an inconsistent state. Solution: Durability now checks whether there is an aligner. If the durability service cannot find an aligner, it signals the splice daemon to perform a system halt. |
OSPL-2106/2361/ 11859 |
Wrong evaluation of deprecated enable_invalid_samples QoS When the enable_invalid_samples QoS is set to false, this QoS is evaluated as true. Solution: The defect in the QoS evaluation algorithm is fixed and the QoS is now evaluated correctly. |
OSPL-2126/ 11591 |
Illegal time messages are reported Injecting disposed AND unregistered transient/persistent data into a late joining reader that had its autopurge_disposed_samples_delay set to a very low value (zero or very close to zero) could in some cases corrupt the shared memory because the instance dispose message would already purge the instance before its corresponding unregister message could register itself into it as well. Solution: This loophole in the purging algorithm has now been closed. |
OSPL-2125 |
Durability Memory Leaks Fixed a memory leak during configuration parsing in durability. Additionally fixed reading of an uninitialised value in the service termination thread in the user layer. Solution: Code fixed. |
OSPL-2210/2305 11702 |
Remove level 4 warnings caused by parallel demarshalling on Windows Parallel de-marshalling, introduced after V6.3.0, caused some level 4 (W4) warnings on Windows. Solution: The warnings have been resolved. |
OSPL-2541 |
The tuner always displays the value 0 in the SampleSequenceNumber field When reading a sample with the Tuner, the field SampleSequenceNumber in the sampleinfo table is always 0. Solution: The SampleSequenceNumber field now displays the correct value. |
OSPL-2545/ 11872 |
Durability does not apply merge-policies with other roles after
initial alignment Durability assumed that all federations in the domain would be able to communicate with each other directly and therefore assumed merge policies did not need to be applied during initial start-up. Furthermore, it selected an initial source of alignment independent of role where it should select one with the same role as itself. Solution: Durability now selects an initial aligner with the same role and applies the configured merge policies with other roles immediately after start-up. |
OSPL-2617/ 11894 |
idlpp-generated C-code does not compile with C++ compiler The idlpp-generated code for standalone C contains code that assigns the result of a malloc directly to a character-pointer variable. Even though assigning a void-pointer to any other pointer is allowed in C, it is not in C++ and this makes it impossible to compile the generated C code with a C++ compiler. Solution: The result of malloc is now cast to a character-pointer before the assignment. |
OSPL-2618/ 11895 |
idlpp-generated functions for the standalone C API don't compile
with .NET 2008 C++ compiler External symbols are used within idlpp-generated functions for the standalone C API, preventing the code from compiling when used in C++. Solution: External symbols have been moved outside the function. |
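The OSPL-2617 fix above can be illustrated with a minimal sketch; the function name and string handling here are hypothetical, not the actual idlpp-generated code:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration of the OSPL-2617 fix: in C the void pointer
   returned by malloc converts implicitly to char*, but a C++ compiler
   rejects that implicit conversion. The explicit cast below is valid in
   both languages, which is the form the generated code now emits. */
char *alloc_string(size_t length)
{
    char *str = (char *) malloc(length + 1); /* cast required for C++ */
    if (str != NULL) {
        str[0] = '\0'; /* start with an empty, NUL-terminated string */
    }
    return str;
}
```

The same pattern applies to any allocation in generated code that must compile under both a C and a C++ compiler.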
Report ID. | Description |
---|---|
OSPL-1697/ 11351 |
Durability periodic report leads to fast growing trace file The trace file grows as a result of a periodic report at the level FINE. Solution: The level of this report has been raised to FINEST. |
OSPL-1897/ 11066 |
When an invalid handle is applied to read_instance it should
return BAD_PARAMETER instead of PRECONDITION_NOT_MET When read_instance is called with an instance handle that is not associated with a known instance, the operation should return BAD_PARAMETER instead of PRECONDITION_NOT_MET. Solution: When it is detected that the provided instance handle is not valid because it does not reference an instance associated with the data reader, the return code is now BAD_PARAMETER. |
OSPL-1990/ 11560 |
Cppgen crash under Windows when using a path larger than 1024 characters When IDL is compiled with C++ generation and a path larger than 1024 characters is used, cppgen crashes under Windows. Solution: Cppgen is fixed. |
OSPL-2051/ 11575 |
Potential crash while removing synchronous readers and writers A race condition existed in the product which could, in specific circumstances, result in a crash of the spliced process. The crash could occur when a pair of synchronous reader and writer are deleted at roughly the same time. Solution: A lock was introduced to protect shared parts of the administration related to synchronous readers and writers. Because of this lock, the race condition can no longer occur. |
OSPL-2134 |
The durability service adds a small delay between handling alignment
requests which may cause a longer alignment time when there are
requests waiting to be served. After handling each alignment (sample) request the durability service adds a small delay. This delay is not necessary when there are still requests waiting to be served; removing it improves the alignment time. Solution: After handling an alignment (sample) request the durability service now checks whether there are still requests waiting. When there are, it starts handling the first request from the waiting list immediately. |
OSPL-2326/ 11766 |
Report plugin gives unhelpful message at runtime if not built correctly If OpenSplice was built without the directive INCLUDE_PLUGGABLE_REPORTING set to 'yes', then enabling the report plugin feature would result in an obscure error message 'ReportPlugin registration failed: -1' and the splice daemon will not start. Solution: The error messages related to this issue have been changed. Now a more explanatory message indicating what happened, and what should be done to resolve the issue, is produced. In particular, the error message will now state that the INCLUDE_PLUGGABLE_REPORTING directive should be set to 'yes' in order to use the pluggable reporting capability of OpenSplice. |
OSPL-2484/ 11819 |
Namespace mismatch in durability service The namespace configuration of the durability service allows catch-all partition expressions and also more specific partition.topic expressions. By mixing these two methods on different nodes in a domain, a durability service could incorrectly determine a topic does not belong to a particular namespace and enter an infinite loop waiting for namespaces from remote nodes that do include the topic. Solution: The matching algorithms in the durability service were made more robust to deal with this situation and now determine the correct namespace for a particular topic. |
OSPL-2510 |
Memory leak when using CReader.read and CReader.take The CReader.read and CReader.take leaked memory by creating a new readCondition with every call. Solution: The creation of the readCondition is moved to the same location as the creation of the Reader itself. Typically a reader is only created once, now the readCondition is also only created once, which minimizes leakage. |
OSPL-2530 |
RT Networking control port not set on first message The control port is determined when the interface becomes available, which occurs after the first message is initialized. In a single-process configuration this may cause the ACK messages to be sent to the wrong control port. Solution: When the send channel is notified that the interface has become available, the control port is now set in the current write buffer in the case of a reliable channel. |
Report ID. | Description |
---|---|
OSPL-2410/ 11793 |
Potential inconsistency of builtin-data when nodes are concurrently leaving and joining a domain An issue in the product could result in inconsistent state, between nodes still present in the domain, related to a node that has left the domain. The problem occurred when a node left the domain and a new node joined before all remaining nodes were aware of the node that left. Solution: The product was modified w.r.t. the processing of builtin data to properly handle this situation. |
OSPL-2409/ 11792 |
Durability never reaches the state 'operational' in a scenario
containing two or more nodes without the explicit configuration of
a namespace for the builtin topics. When the durability configuration file does not configure a namespace for the builtin topics, a namespace called AutoBuiltinTopics should be created automatically. This namespace should be responsible for aligning various builtin topics. Due to a flaw, one of the builtin topics, CMParticipantInfo, was not included in the namespace. Also, a policy for the namespace was not provided. As a consequence the nodes initially would try to align their namespaces, but would never reach a 'complete' state because the CMParticipantInfo topic could never be aligned. This caused the system to indefinitely try to align the namespace for this topic and never reach the 'operational' state. Solution: The CMParticipantInfo topic has been added to the AutoBuiltinTopics namespace and a policy for this namespace has been added. Durability now reaches the 'operational' state in a scenario containing two or more nodes without the explicit configuration of a namespace for the builtin topics. |
OSPL-2408/ 11791 |
Enhanced bind behaviour control By allowing more control over the behaviour of the RTnetworking services regarding the bind-address and port reuse, more advanced deployments can be supported. Two attributes have been added to the General/NetworkInterfaceAddress section of the RTnetworking service: 'bind' and 'allowReuse'. The 'bind' attribute controls whether the networking service binds to the wildcard address ('any') or to the NetworkInterfaceAddress ('strict'). The boolean 'allowReuse' attribute specifies whether the SO_REUSEADDR option is specified before binding a socket. (Note: The deployment manual will be updated at a later release regarding these options. The configurator tool can be used to configure the options). |
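The OSPL-2408 entry above can be sketched as a configuration fragment. The attribute names and the General/NetworkInterfaceAddress path come from the entry itself; the surrounding element structure and the address value are illustrative assumptions, as the deployment manual had not yet been updated for these options:

```xml
<NetworkService name="networking">
  <General>
    <!-- 'bind' selects the wildcard address ("any") or the configured
         interface address ("strict"); 'allowReuse' toggles SO_REUSEADDR
         before the socket is bound. Structure shown is illustrative. -->
    <NetworkInterfaceAddress bind="strict" allowReuse="true">
      192.168.1.10
    </NetworkInterfaceAddress>
  </General>
</NetworkService>
```

The configurator tool can be used to generate a verified configuration instead of editing the XML by hand.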
Report ID. | Description |
---|---|
OSPL-2150/ 11428 |
When a disconnect occurs samples are marked as NOT_ALIVE. After a
reconnect the samples should become ALIVE again, but this does not
occur for existing datareaders in the case where the
BY_RECEPTION_TIME policy is used. In the case where a reader and a writer are running on different nodes and the nodes are disconnected, samples at the reader are marked as NOT_ALIVE. When the connection is re-established and durability is configured to re-align the lost samples, the reader is expected to show the samples again. However, because the unregistration of a sample carries the BY_RECEPTION_TIME policy, it was treated as a new sample that needed to be aligned by durability. Consequently, the unregistration incorrectly reverted the aliveness of the sample. Additionally, the process to determine a new master durability service did not take the proper time-outs into account, causing temporarily conflicting masters and superfluous alignment of data. Solution: The durability service has been changed. It now ignores updates for its own samples that originate from writers with its own ID, so a durability service no longer processes data from other nodes about its own data. The durability service also takes the proper heartbeat expiry-time into account when determining a new master, preventing superfluous alignment of data. As a consequence the default heartbeat time-out has been changed to prevent alignment in default configurations from taking longer: the default expiry-time is now 4 seconds instead of 10. The deployment guide has not been updated yet to reflect this change. |
OSPL-2219 |
Restart failure action on networking service causes communication error When the restart failure action is enabled on the networking service and the service is restarted, communication errors could be observed with the new networking service. Solution: The defect in the service restart mechanism is now fixed and the networking service will correctly communicate. |
OSPL-2220 |
Memory leak of the database type v_message<kernelModule::v_participantInfo> In the case that a restart failure action is enabled on a service it can be observed that the database object count of v_message<kernelModule::v_participantInfo> rapidly increases. Solution: The defect in the service restart mechanism is now fixed and the v_message<kernelModule::v_participantInfo> will not leak anymore. |
OSPL-2315/ 11768 |
C++ ping example does not communicate with Java pong example The C++ ping application did not initialise its string parameter, leading to errors in the Java pong application. Solution: The C++ ping example has been updated to initialise all sample data before writing it. |
OSPL-2317/ 11767 |
Durability service issue with multiple namespaces of a remote node There was an issue with the durability service's management of a remote node's namespaces. It may incorrectly determine a remote namespace already exists and so it may not be added to the node's administration. Solution: The issue has been corrected by comparing the namespace names that are offered by a remote node rather than comparing the namespace properties as was being done before. |
OSPL-2327/ 11774 |
idlpp compiler generating non-compilable code with '-l isocpp'
from IDL containing structs that aren't topics The idlpp compiler was generating non-compilable code with '-l isocpp' or '-l isoc++' from IDL containing structs that aren't topics. The code did not compile because REGISTER_TOPIC_TRAITS entries were being generated for these structs in error. Solution: The idlpp compiler has been fixed to not generate REGISTER_TOPIC_TRAITS entries for non-Topic types; such definitions now generate compilable code. |
OSPL-2431 |
During the alignment process durability performs new, but
superfluous alignments while being busy in an existing alignment. When a node is busy with an alignment action and is triggered to align again, the durability service would perform the new action as well. This is not always necessary and causes alignment data to flood the network. Solution: Superfluous alignment requests that arrive during an existing alignment are now ignored. This prevents flooding the network. |
Report ID. | Description |
---|---|
OSPL-1478/ 11042 |
idlpp OSPL_BOUNDS_CHECK NULL detection error log incorrect idlpp generated incorrect tracing for OSPL_BOUNDS_CHECKing. When a variable was initialized to NULL, the bounds check reported an out-of-range error instead of NULL. Solution: Updated idlpp so it logs the correct message when NULL is detected. |
OSPL-1673/ 11578 |
DDSI2 sometimes logs socket error 10035 on Windows Socket blocking and waiting behaviour on Windows does not really support waiting for packets to arrive on a number of sockets in one thread, while trying to send packets in blocking mode on another thread, as the socket is either blocking or non-blocking. The auxiliary data transmission used a socket that was also in use for receiving data. Consequently, once the socket send buffer filled up, the Windows kernel could return error 10035, EWOULDBLOCK. In such a case the packet would be dropped. The protocol is such that these lost packets would not affect correctness, but it does impact timing and is generally undesirable. Solution: All outgoing traffic now uses dedicated transmit sockets. |
OSPL-2026/ 11571 |
Windows Service access rights An application running as a normal user is unable to create a participant or topic when running OpenSplice as a service on Windows. Solution: The communication pipe for the service thread was created using default access rights, which blocks writes from low-privileged users. These rights have been changed so that every user can write to the pipe. |
OSPL-2123 |
DDSI2E mapping of transport_priority to channel incorrect DDSI2E internally maps transport priorities to channels for processing protocol messages, retransmits and incoming data on the threads that best correspond to the priority of the message. This mapping differed from the mapping of samples to channels which is done internally by the kernel. In consequence, processing could take place on a different thread than intended by the configuration. This only affected scheduling, especially under very high CPU loads. In the particular case that channels were defined in order of descending priority, the mapping was always correct. Solution: The mapping function has been fixed. |
OSPL-2244 |
IPv6 interface detection in Windows causes a crash If an application in Windows used the networking service and tried to detect available IPv6 interfaces a system crash could occur. Solution: The implementation of the IPv6 interface detection is improved so that it no longer causes crashes. |
OSPL-2257 |
Broader use of empty-String interning in Java copy routines Solution: The empty string interning optimisation for the Java copy routines has been applied to bounded strings as well, improving performance when the data includes a lot of empty bounded strings. |
Report ID. | Description |
---|---|
OSPL-28/ 4767 |
Wrong return code in register_type API call When the register_type API function is called with a type name that is already registered but with a different metadescriptor, the return code DDS_RETCODE_OK is returned. This should be DDS_RETCODE_PRECONDITION_NOT_MET. Solution: The defect in the register_type function is solved and the correct return code is returned. |
OSPL-148 |
Improve Windows/WindowsCE condition variable implementation The Windows and Windows CE condition variable implementation contained a bug where if there were no threads waiting on the condition variable, an open handle from a semaphore would not be closed when returning from the function. Solution: The condition variable signal algorithm has been improved to test whether there are any waiting threads before opening the handle. This resolves the bug and improves performance slightly because it avoids opening the semaphore when it is not required. |
OSPL-528/ 9909 |
Memory leak in create_querycondition Creating and deleting a QueryCondition leads to a memory leak. Solution: The memory leak has been fixed. |
OSPL-885/1645 |
C# idlpp crash when no module given in IDL When IDL for C# contains a structure defined outside a module, the idlpp compiler crashes. Solution: The defect in the C# idlpp is now fixed and it is possible to define structures without a module tag. |
OSPL-1046 |
Data sent on a particular network partition should not be received
by DDS instances that are not connected to that network partition When network partitions are configured and data is sent on a specific partition (not the default partition), other DDS instances should not receive this data when they are not connected to that specific partition. Previously, when networking received data from a partition that was not connected or unknown, it delivered this data in the default partition. Solution: When networking receives data on a network partition that is either not connected or unknown to that networking instance, the data is now dropped. |
OSPL-1187 |
Removal of Sun Code snippet and license acknowledgement OpenSplice Tuner and Tester used a code snippet from Sun that had a license accreditation. Solution: Removal of the code means license acknowledgement is removed. |
OSPL-479/OSPL-1435/ 10963 |
DDSI2 support for IPv6 Solution: Support is added for IPv6 in DDSI2. |
OSPL-1438/ 10974 |
Visual studio limitation on large topic structure definitions For very large topic structure definitions, the Visual Studio compiler runs into a limitation of the maximum length of a string. If the metaDescriptor character string data exceeds 64k in size, the Visual Studio C++ compiler fails to build the generated code. Solution: The metaDescriptor string is replaced by an array which resolves the maximum string limitation in Visual Studio. |
OSPL-1526 |
Try automatic repairing option of the configurator did not work for number values When loading a config file into the configurator with faulty number values the configurator asks if it should repair those faulty values. After this is done the corrected values are not written to the config file when this is being saved. Solution: The defect in the configurator is now fixed and the corrected values are now being stored. |
OSPL-1537/ 11089 |
spliced may crash after a service with the "systemhalt" failure action dies If a service dies it may not have performed a detach from the OpenSplice kernel/database and so the attached services count may be incorrect. That could lead to a crash of a database thread because the spliced may get detached too early. Solution: When spliced detects that a service has died, it ensures that attached services count is correctly maintained, so such a crash cannot occur. |
OSPL-1549/ 11096 |
Durability report about name-space backup unclear When the durability service on start-up detects that the current set of persistent data on disk is not complete, because the service did not manage to fully complete the alignment of the set of persistent data during the previous run, it checks whether an older but complete set is still available on disk. If so, it replaces the newer but incomplete set with the older but complete set, after which the durability services in the domain determine who has the latest complete set and use that one everywhere. If no older complete set is available locally, a report was produced. The report was not deemed very clear, and besides, the situation is not considered an error but merely a warning; alignment will still be able to continue. Solution: The report has been rewritten to make the situation clearer and is now reported as a warning instead of an error. |
OSPL-1591/ 11165 |
Services do not use the shared memory threshold correctly When the shared memory database is configured to have a threshold, the services are entitled to use half of that region meaning they are able to continue to run when shared memory gets low. The issue was that the services were in fact using that threshold region in the same way as regular applications - i.e. they were unable to use any of that region. There was also a bug where terminating services with a failureAction of "kill" were being left in a zombied state and not necessarily exiting. Solution: A fix has been applied that allows the services to correctly use up to 50% of the threshold region. A fix has also been applied to the monitoring of the died/terminating services so that they cannot be left in a zombied state. |
OSPL-1615 |
Durability service does not reach COMPLETE state when some
partition-topic combinations are not covered by the name-space configuration In case the durability service is configured with name-space settings that do not cover all locally available partition-topic combinations, the service did not reach the COMPLETE state that indicates that all partition-topic combinations that the durability service manages have been fully aligned. Solution: The state of locally available partition-topic combinations that are not supposed to be managed by the durability service are no longer included in determining whether or not the alignment is complete. |
OSPL-1634 |
New command line parameter for the configurator The OpenSplice Configuration Editor now contains a new command line instruction.
|
OSPL-1642/ 11224 |
C++ memory leaks of the default QoS sets stored for the DDS Entities The statically initialized default QoS sets for each DDS Entity (Publisher, DataWriter etc.) are stored as pointer types and are not deallocated, so they appear as a leak when main exits. Solution: Storing these default QoS sets as _var types means that the container class takes care of the deallocation, even for the statically initialized types. |
OSPL-1644 |
Inappropriate warning when retrieving network interface information on Windows When a network interface is present but not connected or configured then a warning is given which is not appropriate. Solution: The discovery of the network interfaces should only consider network interfaces that are available and the warning is now only output when required. |
OSPL-1654 |
Networking crashes when a network partition is configured without an explicit name Networking crashes when a network partition is specified without a name. The name attribute of a network partition is optional; when it is not specified, the name should default to the address of the network partition, but this step was omitted. Solution: When the name of a network partition is not set, the name is now set to the address attribute of the network partition. |
OSPL-1668 |
Dispose/writedispose action returns error in the Tuner When creating a reader/writer in the Tuner and then clicking on the Dispose or WriteDispose button an error message may be shown in the status pane stating that the dispose failed while the dispose from system-perspective was successful. Solution: The algorithm for showing this message is fixed and the message will not be displayed anymore. |
OSPL-1708 |
Default MMF persistency configuration settings don't work in combination
with single-process mode When selecting MMF as persistency and running in single-process mode, the durability service would not start due to wrong default settings for the mapping address and size of the store. Solution: The default settings are changed in the case of single-process mode to ensure proper functioning of the durability service in that deployment mode as well. |
OSPL-1751 |
DDSI2 configuration defaults incompatible with various locales When starting DDSI2 in a locale in which the decimal separator character was different from "." (for example, the French locale, in which it is ","), some DDSI2 default values would be flagged as erroneous, preventing DDSI2 from starting. Solution: DDSI2 has been modified to avoid this problem. |
OSPL-1806/ 11373 |
Tuner crash when trying to read samples which contain a lifespan
that is not set to infinite The Tuner crashes when trying to read samples which have a limited lifespan set. Solution: The defect is now fixed and the Tuner will not crash. |
OSPL-1889 |
An IDL file that contains a typedef to an enum causes problems in Java. If the IDL contains a typedef to an enum, then the middleware should unwind the typedef to retrieve the definition of the enum. This was not done properly, which effectively caused the system to crash. To avoid this, customers had to rewrite their IDL in such a way that it did not contain any typedef to enums. Solution: The implementation of the IDL processing function for Java has been modified so that the issue does not cause crashes anymore in Java. |
OSPL-1913/ 11463 |
Issue performing dispose or writedispose as last operation in coherent set The completion of a coherent set is signaled to the DataReader side by sending a new sample that contains the transaction information. That sample is created as a clone of the last sample sent from within the set. That sample was not being stored correctly in the case of a dispose or writedispose, so it was not being sent. Solution: The algorithm has been corrected so that any form of writing a sample will store the sample so it can be resent when the transaction is completed. |
OSPL-2000/ 11528 |
The wait_for_historical_data_w_condition() call on the DataReader
is not working properly The wait_for_historical_data_w_condition call can return BAD_PARAMETER even when all input parameters are correct, because the durability service is not ready to receive the request from the DataReader at the time it is issued. Furthermore, the durability service may also fail to deliver all historical data that matches the condition if the condition contains a filter expression. Solution: The wait_for_historical_data_w_condition algorithm now waits until all configured durability services are operational before issuing the request. If none have been configured, or one or more that have been configured are not operational before the given time-out expires, PRECONDITION_NOT_MET is returned. Furthermore, the internal algorithm to evaluate the filter expression has been modified to ensure correct evaluation of the expression plus parameters against the set of available historical data. |
OSPL-2004 |
Exception handling is broken on POSIX Even though the application signal handler is properly called from the thread that caused a synchronous signal, it was ALSO called from within the signalHandlerThread before that, which it is not supposed to do. Solution: The POSIX signal handler has been modified so that the application signal handler is no longer invoked from within the dedicated signal handler thread in the case of a synchronous signal raised from within the application itself. |
OSPL-2010 |
Wireshark on windows Wireshark for RT networking was extremely difficult to build by a user on Windows platforms. Solution: ADLINK has decided to provide this now pre-built on its website. We continue to leave the source code and build files in place, should a user want to use them. |
OSPL-2052/ 11569 |
Incorrect dependencies for dcpssaj.jar The manifest file of the dcpssaj.jar file contains a lot of dependencies which are not needed. Solution: The unrequired dependencies are now removed. |
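The OSPL-1438 workaround above (replacing one huge metaDescriptor string by an array) can be sketched as follows; the identifiers and XML content are illustrative, not the actual idlpp output:

```c
#include <stdlib.h>
#include <string.h>

/* Visual Studio rejects a single string literal larger than 64k.
   Emitting the meta descriptor as an array of smaller chunks, joined
   at runtime, keeps each literal below the limit regardless of the
   total descriptor size. Names and content are illustrative only. */
static const char *meta_chunks[] = {
    "<MetaData version=\"1.0.0\">",
    "<Struct name=\"Msg\"><Member name=\"id\"><Long/></Member></Struct>",
    "</MetaData>"
};

/* Concatenate all chunks into one heap-allocated string. */
char *join_chunks(const char **chunks, size_t count)
{
    size_t i, total = 0;
    char *result;

    for (i = 0; i < count; i++) {
        total += strlen(chunks[i]);
    }
    result = (char *) malloc(total + 1);
    if (result == NULL) {
        return NULL;
    }
    result[0] = '\0';
    for (i = 0; i < count; i++) {
        strcat(result, chunks[i]);
    }
    return result;
}
```

Each chunk stays far below the per-literal limit, so arbitrarily large topic descriptors compile without hitting the compiler restriction.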
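The OSPL-1751 locale pitfall above stems from strtod() honouring the decimal separator of the current LC_NUMERIC locale. A minimal sketch of a locale-tolerant parse follows; the helper is illustrative and not DDSI2's actual fix:

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* strtod() uses the decimal separator of the active LC_NUMERIC locale,
   so a default such as "0.5" stops parsing at the '.' in locales that
   use ',' (e.g. French). Rewriting '.' to the locale's separator before
   parsing makes the default value locale-independent. Illustrative only. */
double parse_decimal(const char *text)
{
    char buf[64];
    char sep = localeconv()->decimal_point[0];
    size_t i;

    strncpy(buf, text, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    for (i = 0; buf[i] != '\0'; i++) {
        if (buf[i] == '.') {
            buf[i] = sep; /* normalise to the active locale */
        }
    }
    return strtod(buf, NULL);
}
```

An alternative design is to parse in the fixed "C" locale; either way the configuration defaults no longer depend on the environment's locale settings.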
Report ID. | Description |
---|---|
OSPL-2110/ 11592 |
On Windows the disconnection or connection of a network interface cable
causes large memory loss A status change of the network interface triggers an event. A memory leak occurred when handling this event, and because the status of the event was not automatically reset, the event-handling function was called repeatedly. Solution: The memory leak is removed and the network interface status change event notification is re-enabled so that new network interface status changes can be detected. |
Report ID. | Description |
---|---|
OSPL-948/ 10679 |
DDSI2 sends duplicate samples for writers that publish in multiple
partitions simultaneously A write in multiple partitions simultaneously is received by DDSI2 as a number of messages from a single writer, one for each partition. All these messages were forwarded by DDSI2 and filtered out by the subscribers running OpenSplice. However, this required extra bandwidth and also caused other vendors' implementations of the DDSI protocol to report them as independent updates. Solution: This behaviour has now been changed, so that DDSI2 filters out such duplicates in the vast majority of cases. In particular, as long as the writer, its local subscribers and the networking services do not run into resource limits, it is guaranteed to filter out all duplicates. The filter can, under very unlikely circumstances, conclude a message is a duplicate when in reality it is not. This requires publishing more than 4 billion samples with a single writer while carefully controlling the behaviour of the writer history cache using resource limits on local readers; a situation no real system is likely to ever encounter. We advise that on systems that may run into this, the Unsupported/ForwardAllMessages setting be set to true. |
OSPL-966/ 10707 |
The RTSM tool is no longer working As a protection mechanism the RTSM tool does not continue processing when it thinks that a given address is not valid. One of the steps it takes to determine whether an address is valid is to check whether the address lies in the virtual memory range that is expected based on the configured size of the shared memory segment. However, the tool determined the size of a shared memory segment incorrectly. Due to this issue the tool reported perfectly valid addresses as 'faulty' and no further processing was done. Solution: The algorithm that calculates the size of the shared memory segment and the valid address range has been repaired to ensure valid addresses are no longer reported as 'faulty' and all further processing can be done. |
OSPL-991/ 10735 OSPL-1771/ 11395 |
take() and take_w_condition() do not have the same behaviour/random
crash on take next instance The dispose_all_data operation on the topic was not treated identically to sending a separate dispose message for every instance. This manifested itself especially in the way the disposed data was delivered to late joiners (which sometimes couldn't see that the data was actually disposed) and in the way disposed data ignored the cleanup delay specified on the durability service. Solution: The new implementation of dispose_all_data is much more in line with sending separate dispose messages for every instance, and thus late joiners will always see the correct instance state. Furthermore the dispose messages obey the same cleanup delays as normal dispose messages. However, the dispose_all_data function will still be more efficient with respect to the utilization of network bandwidth, CPU cycles and memory than the manual transmission of separate dispose messages. |
OSPL-1205/ 10845 |
Terminating applications report pthread_join failed with error 35 When an application with a registered exit handler terminates and this exit handler itself contains an exit call, the ospl-error file will report the message: pthread_join failed with error 35. Solution: The defect in the OpenSplice signal handling is fixed and this error will not be reported anymore. |
OSPL-1334/ 10898 OSPL-1335/ 10899 |
PurgeList may illegally remove a groupInstance. The purgeLists are sometimes populated by the same instance multiple times (for different generations). Although the purgeList expiry algorithm should handle these situations correctly, some scenarios were not properly handled yet, and a groupInstance from the emptyPurgeList could be freed while it was already reincarnated as a new generation. Solution: New code has been added that prevents outdated generations from being inserted into the emptyPurgeList, thus preventing this list from having duplicate entries and thus preventing it from deleting a groupInstance that is already reincarnated. |
OSPL-1341 / 10914 OSPL-1930 / 11473 |
The reliable network communication may not operate correctly when the
first messages of a sending node arrive out of order. When reconnection is enabled, the first messages of another starting node arrive out of order, and the discovery heartbeats arrive later than the first message, the reliable network channel will not operate correctly. Solution: The notification that a node has become alive by the discovery protocol should not reinitialize the reliable channel administration associated with that node when the reliable channel had already detected that the node was alive. |
OSPL-1445/ 10981 |
Possible deadlock when using the API find_topic function When, in a multithreaded application, one thread uses the API find_topic function while the topic is not yet defined and a timeout is set, and another thread executes a function that needs the domain participant, the latter thread will be blocked because the API find_topic function locks the domain participant and does not release it until the topic is found. Solution: The defect in the find_topic API function is now fixed and no deadlock will occur. |
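The fixed behaviour can be illustrated with a small Python sketch (not the actual OpenSplice code; all names are illustrative): waiting on a condition variable releases the participant lock for the duration of the wait, so a concurrent operation such as create_topic can proceed and wake the waiter instead of blocking until the timeout expires.

```python
# Illustrative sketch: find_topic must not hold the participant lock across
# its timed wait. Condition.wait_for releases the underlying lock while
# blocked, allowing other participant operations to run concurrently.

import threading

class Participant:
    def __init__(self):
        self._cond = threading.Condition()  # guards the participant state
        self._topics = {}

    def create_topic(self, name):
        # Needs the participant lock; with the fixed find_topic this no
        # longer deadlocks against a waiting find_topic call.
        with self._cond:
            self._topics[name] = object()
            self._cond.notify_all()

    def find_topic(self, name, timeout):
        with self._cond:
            # wait_for releases the lock while waiting, reacquires on wake
            self._cond.wait_for(lambda: name in self._topics, timeout)
            return self._topics.get(name)

p = Participant()
# another thread creates the topic shortly after find_topic starts waiting
creator = threading.Timer(0.05, p.create_topic, args=("MyTopic",))
creator.start()
found = p.find_topic("MyTopic", timeout=2.0)
creator.join()
```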
OSPL-1495/ 11040 |
In DBMSConnect the mapping from DDS unsigned types to database record
fields is not correct When a topic contains unsigned integer fields the mapping to the corresponding database schema is not correct. This means that the data written from DDS to the database is not equal to the data that is stored in the database. This mapping depends on the DBMS used; for example, MySQL supports unsigned integer fields while Microsoft SQL does not. Solution: The mapping of the DDS topic fields to the corresponding database schema is now made dependent on the type of DBMS that is used. This also applies to the reading and writing of the database records. |
OSPL-1557/ 10734 |
Crash of durability during initial alignment Unregistrations that are aligned by the durability service are stored without any data to reduce footprint. These unregistrations can only be re-published locally when an instance handle is provided by the durability service. In some scenarios, the durability service did not provide an instance handle, which made the service crash while locally republishing an aligned unregistration. Solution: The durability service now ensures an instance handle is created in any case, while also making sure a registration is created in the instance for the writer that originally wrote the aligned sample. |
OSPL-1512/ 11066 |
PRECONDITION_NOT_MET doc update The reference manuals did not correctly document all cases where PRECONDITION_NOT_MET could be returned. Solution: Manuals updated. |
OSPL-1595/ 11664 |
Not receiving data locally when networking is configured but network
interface is disconnected When networking does not find a suitable and available network interface, networking will terminate. Because the configuration specifies networking to be present, the sending of data by a publisher will be affected because it will wait until the network service becomes available. Solution: When networking finds a suitable network interface but the interface is currently not connected, networking now continues and monitors the status of the network interface, notifying the system when it becomes available. |
OSPL-1669 |
Merged historical data is not delivered to active datareaders. When configured for merging after re-connecting, the durability service does not deliver the aligned samples to existing data-readers. Only newly created data-readers from that point forward will get the data. This is caused by an internal optimisation mechanism between the transient store and existing data-readers. The connection between the transient store and existing data-readers can be closed in some situations where remote nodes disconnect, and the durability service did not re-instate this connection when the node reconnects. Solution: The connection between the transient store and existing readers is now checked and re-instated in case a node reconnects. |
OSPL-1681/ 11284 |
leaseManager reports lease update is behind schedule It was possible that the ospl-info log file reported numerous warnings that the lease is behind schedule. These invalid warnings came from the lease manager, which had a fault in the evaluation algorithm for deadline leases that caused the warnings to be displayed while the lease was correct. Solution: The defect in the lease manager algorithm is now fixed and these invalid warnings will not occur anymore. |
OSPL-1682/ 11291 |
crash of spliced In certain scenarios the reader purgeList was doing invalid memory reads by trying to access already deleted purgeItems, thereby causing potential memory corruption of totally unrelated objects. Solution: The purge algorithm has been modified to prevent this situation, thus preventing the memory from becoming corrupted and improving the overall stability of the system. |
OSPL-1703/ 11354 |
Crash of networking when terminating as a result of the reception of a signal. When the network service is terminated as a result of a signal, the kernel objects created by networking are freed while some of the networking threads may still try to access these kernel objects. Solution: The termination of the networking threads is synchronised with the freeing of the networking kernel resources. |
OSPL-1729 |
Calling DomainParticipantFactory.set_qos(_) using the Java API segfaults. The underlying JNI layer of the Java DCPS API used the wrong jfieldID when getting a value from Java. Solution: The correct jfieldID is now used. |
OSPL-1816 |
Re-using a Record and Replay storage in consecutive replay scenarios An issue in the Record and Replay service caused unexpected behaviour when a storage is re-used after it was previously paused using the setreplayspeed command. Solution: The issue is resolved, so a storage can now be used multiple times during the lifecycle of a Record and Replay service without any constraints. |
OSPL-1852 |
Topic disappearance when topic created in parallel In a scenario where a specific topic was created for the first time in the system, but for which a duplicate was created by another application before the original could enable its topic, a refCount to the resulting topic got dropped and that topic could suddenly disappear while still being used by the system. Solution: A change in the refCounting algorithm has now solved this issue. |
OSPL-1855 |
Synchronous signals sent by an external mechanism could cause deadlocks. Solution: Signals that are raised by an asynchronous mechanism will now be handled in an asynchronous manner. That means that handling of the signal is delayed until all threads have successfully finished their consultations/modifications of the shared memory, leaving it in a consistent state. Also our signal handler will no longer be installed when an (ignorable) signal is set to SIG_IGN. |
OSPL-1867/ 11443 |
System termination fallback mechanism When a service crashes and system termination is set in progress, a safe system termination is not always guaranteed, i.e. the service could end up in a deadlock, stalling system termination. This is not acceptable; the system should always terminate. Solution: A new service termination thread is spawned when the system state of a process is set to terminating. This thread waits for 5 seconds to see if the process goes from the terminating state to the terminated state. If this has not happened after 5 seconds, the process is killed by means of _exit. |
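The fallback can be sketched in a few lines of Python (an illustrative model, not the OpenSplice implementation; the function name and signature are made up): a watchdog thread gives normal shutdown a bounded grace period and forces the process out with `os._exit` if the terminated state is never reached.

```python
# Illustrative sketch of the termination watchdog: wait a bounded time for
# orderly shutdown, then force-exit if it stalls (e.g. in a deadlock).

import os
import threading

def start_termination_watchdog(terminated_event, grace_seconds=5.0,
                               force_exit=os._exit):
    """Spawn a thread that force-exits unless shutdown completes in time."""
    def watchdog():
        if not terminated_event.wait(grace_seconds):
            force_exit(1)  # shutdown stalled: kill the process outright
    t = threading.Thread(target=watchdog, daemon=True)
    t.start()
    return t

# Demo with a fake exit function so the interpreter survives: shutdown
# completes within the grace period, so the forced exit never fires.
forced = []
done = threading.Event()
w = start_termination_watchdog(done, grace_seconds=0.2,
                               force_exit=lambda code: forced.append(code))
done.set()          # normal termination finished in time
w.join()
```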
OSPL-1923 / 11472 |
Application not detached from shm after deletion of all the
participants. When an application creates 2 or more participants and all these participants are deleted using the delete_participant function, the application still holds a reference to the shared memory. Solution: The defect is now fixed and when all participants of an application are gone, the connection to the shared memory is also released. |
OSPL-1926 / 11518 |
Read of disposed instance may return null for key string fields on Java When reading a disposed instance using the Java API, the returned sample can have its key fields unset. This occurs when the topic contains a non-key string field that precedes the key fields. Solution: The copy function used when reading a sample now walks over all fields of a sample and does not stop at the first non-key string field. |
OSPL-1934 |
Durability PartitionTopic configuration for a NameSpace is not
available in configurator The ability to configure partition-topic combinations for NameSpace contents for the durability service has been added recently. The configurator tool was not updated to include this configuration option. Solution: The configurator tool has been extended with the missing PartitionTopic configuration option. |
OSPL-1936/ 11509 |
Reporting/tracing of networking service could be improved In case the networking service is configured to allow re-connections, it should report when a remote node re-connects. In addition, the networking service should also report the channel when reporting a missing packet in its trace (if configured) as well as report when and if the missing packet is received to be able to find out if the service recovered from the missing packet or not. Solution: The reporting and tracing extensions requested above have been added to the networking service. |
OSPL-1946/ 11512 |
With compression or encryption enabled lost packets may not be resent In some situations networking would try to access the compressed and/or encrypted content of a packet in its resend administration, causing packets to not be re-transmitted. Solution: All information needed for (re-)sending a packet is now read from the packet buffer before compression and/or encryption. |
OSPL-1952 |
Re-creation of topic definitions from the XML persistent store can
cause application and service to crash The XML persistency implementation of the durability service uses a deprecated (de)serializer to store topic definitions to disk and to recreate them after start-up. When using this particular serializer it is possible that types can be resolved from DDS already by other services or applications before they are finalized. When such an incomplete type is used the process that uses it crashes. Solution: The durability service now uses a new (de)serializer that ensures that types cannot be resolved before being finalised. For backward compatibility, the durability service is still able to read the old serialized format from disk. In that case it still uses the old serializer, but it ensures to write any new definitions using the new serializer and therefore the new format. |
OSPL-1998/ 11540 |
Durability and Memory Consumption When using the durability service in combination with the XML persistent store and the following topic QoS settings, memory leakage can be observed: a durability_service QosPolicy with history_kind = KEEP_LAST_HISTORY_QOS and history_depth = 1, and a destination_order QosPolicy with kind = BY_RECEPTION_TIMESTAMP_DESTINATIONORDER_QOS. Solution: The defect in the durability service is now fixed and in the described scenario data will not leak anymore. |
Report ID. | Description |
---|---|
OSPL-1682/ 11291 |
crash of spliced In certain scenarios the reader purgeList was doing invalid memory reads by trying to access already deleted purgeItems, thereby causing potential memory corruption of totally unrelated objects. Solution: The purge algorithm has been modified to prevent this situation, thus preventing the memory from becoming corrupted and improving the overall stability of the system. |
OSPL-1729 |
Calling DomainParticipantFactory.set_qos(_) using the Java API segfaults. The underlying JNI layer of the Java DCPS API used the wrong jfieldID when getting a value from Java. Solution: The correct jfieldID is now used. |
OSPL-1852 |
Topic disappearance when topic created in parallel In a scenario where a specific topic was created for the first time in the system, but for which a duplicate was created by another application before the original could enable its topic, a refCount to the resulting topic got dropped and that topic could suddenly disappear while still being used by the system. Solution: A change in the refCounting algorithm has now solved this issue. |
OSPL-774 |
New rmipp option to export generated code Solution: A new command line option has been added to the RMI pre-processor to be able to create a Windows DLL from generated code. This option is: -P dll_macro_name[,<header-file>]. Only applicable to C and C++. It sets the export macro that will be prefixed to all functions in the generated code, which allows creating DLLs from generated code. Optionally a header file can be given that will be included in each generated file. |
OSPL-1076/ 10787 |
Compression in networking is not configurable other than on/off. Data compression typically involves a trade-off between CPU usage and the amount of compression achievable. The utility of the compression feature would be increased by allowing more flexibility in terms of this trade-off. Solution: The compression "level" of the existing zlib compressor may now be configured. Other compression algorithms may also be used. See the Deployment Guide for details. |
OSPL-631/ 10459 |
Using read or take with max_samples limit set can cause some key-values to be never read. The read and take operations return instances starting with lower key-values. If not all available data is read at once (e.g., when having set the max_samples limit) and the lower key-values keep receiving updates, a subsequently performed read operation will return the updated instances, which may prevent the higher key-values from being read. Solution: The read and take operations are changed to provide data circularly from a cursor. This means that these operations will 'resume' a read as if read_next_instance was successively called. This way all instances can be read even when lower key-values get updated between two read operations with a max_samples limit. |
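The cursor behaviour can be sketched as follows (a simplified Python model, not the OpenSplice reader; the `Reader` class and its fields are illustrative): each read resumes after the last instance returned and wraps around, so frequently updated low keys cannot starve the high keys.

```python
# Sketch of the cursor-based ("circular") read: instead of always starting
# from the lowest key, each call resumes after the last instance returned.

class Reader:
    def __init__(self, instances):
        self._keys = sorted(instances)   # keyed instances in key order
        self._data = dict(instances)
        self._cursor = 0                 # resume point for the next read

    def read(self, max_samples):
        out = []
        n = len(self._keys)
        for i in range(min(max_samples, n)):
            key = self._keys[(self._cursor + i) % n]  # wrap around the set
            out.append((key, self._data[key]))
        self._cursor = (self._cursor + len(out)) % n
        return out

r = Reader({1: "a", 2: "b", 3: "c", 4: "d"})
first = r.read(max_samples=2)    # keys 1 and 2
r._data[1] = "a2"                # low key keeps receiving updates
second = r.read(max_samples=2)   # still reaches keys 3 and 4
```

With a lowest-key-first policy the second read would have returned the updated key 1 again, and keys 3 and 4 could be starved indefinitely.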
OSPL-1023 |
Service failure actions can be taken multiple times When multiple services are known to the service framework, a failure action can be taken multiple times. Solution: The defect in the service failure action algorithm is now fixed and the action will only be done once. |
OSPL-1051 |
OpenSplice RMI Services Activation/De-activation in one call Service activation was done on a per-service basis. A given service becomes active (waits for incoming requests) by calling 'DDS_Service::run(service_name, ...)', which could be either blocking or non-blocking based on the provided arguments. To activate multiple services, the "run" operation must be called as many times as there are services. Solution: Two new operations have been added to the "CRuntime" class to activate and de-activate all the registered services in one call: - CRuntime::run() - CRuntime::shutdown(bool wait_for_completion = true) The "run" operation is a blocking operation that blocks the calling thread until shutdown is called. The DDS_Service::run operation is kept but becomes non-blocking. It is recommended to use the CRuntime object for service activation and de-activation. |
OSPL-1166/ 10823 |
Terminating the "ospl -f start" operation may not kill all services The algorithm in ospl that performs the shutdown of the splice daemon and its child services in the case of blocking mode did not correctly detect whether the splice daemon had been terminated. This meant that it was possible that the splice daemon and its services may not necessarily have been terminated when "ospl -f start" returns. Solution: The shutdown algorithm in ospl has been improved to correctly detect the termination status of the splice daemon in both normal and blocking modes. |
OSPL-1172/ 10827 |
Performance difference between the datareader listener and subscriber listener In certain scenarios the subscriber listener handling is faster than the datareader listener handling. Solution: The handling algorithm of the datareader listener is improved to match the subscriber handling. |
OSPL-1341/ 10914 |
The reliable network communication may not operate correctly when the
first messages of a sending node arrive out of order. When reconnection is enabled, the first messages of another starting node arrive out of order, and the discovery heartbeats arrive later than the first message, the reliable network channel will not operate correctly. Solution: The notification that a node has become alive by the discovery protocol should not reinitialize the reliable channel administration associated with that node when the reliable channel had already detected that the node was alive. |
OSPL-1384/ 10907 |
OpenSplice logs errors when the XML configuration file contains DOCTYPE descriptors Solution: The code that logged the errors has been removed. |
OSPL-1393/ 10907 |
The DDSI2 service uses Watchdog scheduling parameters from RT networking config The DDSI2 service incorrectly used the NetworkService/Watchdog element instead of DDSI2Service/Watchdog element from the OpenSplice configuration file for determining the scheduling parameters of the watchdog thread. Solution: It now uses DDSI2Service/Watchdog. |
OSPL-1051 |
OpenSplice RMI Services Activation/De-activation in one call Service activation was done on a per-service basis. A given service becomes active (waits for incoming requests) by calling 'DDS_Service::run(service_name, ...)', which could be either blocking or non-blocking based on the provided arguments. To activate multiple services, the "run" operation must be called as many times as there are services. Solution: Two new operations have been added to the "CRuntime" class to activate and de-activate all the registered services in one call: - CRuntime::run() - CRuntime::shutdown(bool wait_for_completion = true) The "run" operation is a blocking operation that blocks the calling thread until shutdown is called. The DDS_Service::run operation is kept but becomes non-blocking. It is recommended to use the CRuntime object for service activation and de-activation. |
OSPL-1478/ 11042 |
BOUNDS_CHECK does not report the name of the incorrect member when the member is null. The copyin routines for C and C++ did not check for null-values in string fields. Solution: Check for null-values in string-fields is added. |
OSPL-1479/ 11041 |
Bugfix to allow correct rebuilding of the C++ APIs (customlibs) on 64-bit Linux systems. The rebuild would fail due to an incorrect makefile. Solution: Makefile corrected. |
OSPL-1489/ 11042 |
Licenses inconsistent. The $OSPL_HOME/LICENSE file was version 2.6, but the StdLicenseterms2.3.pdf was for 2.3. Solution: PDF corrected to 2.6. |
OSPL-1511 |
Setting the buffer size of the receive socket to the configured value may fail Setting the receive buffer size of the receive socket to the configured value may fail, which may cause message loss under the worst-case expected network load. Solution: When setting the receive buffer size of the receive socket to the configured value, a warning report is logged if the operating system doesn't apply the setting. |
OSPL-1558 |
Durability sometimes creates conflicting namespace for __BUILT-IN
PARTITION__ when partitionTopic is used in namespace-definition. Durability could automatically create a conflicting namespace for __BUILT-IN PARTITION__ when partitionTopic is used. Solution: Durability now only creates a namespace in which the builtin topics are matched, instead of matching the whole builtin partition. |
OSPL-1597/ 10974 |
DDSI2 external address setting refuses valid IP addresses The DDSI2 General/ExternalNetworkAddress configuration setting contained an error in validating the specified address, causing valid IP addresses to be rejected. Solution: The validation code has been corrected. |
OSPL-1639 |
The OpenSplice DDS Tuner cannot write bounded strings. When a bounded string is written by the OpenSplice DDS Tuner and this value is read back by the system, a null reference occurs where the text of the string should have been. Solution: The defect in the OpenSplice DDS Tuner is now fixed and bounded strings are now correctly written to the system. |
OSPL-1621/ 11189 |
Leakage of instances when built in topics not configured. When OpenSplice is configured not to transmit builtin topics the deletion of a datawriter was no longer reported to the rest of the Domain resulting in the leakage of instances that belonged to that datawriter and that were not unregistered explicitly prior to the deletion of that datawriter. Solution: A sample for the DCPSPublication topic is now always sent on the deletion of a datawriter, regardless of the configuration setting for builtin topics. However, this sample is non-transient and will be consumed immediately, so that it does not accumulate unnecessary resources. |
Report ID. | Description |
---|---|
OSPL-626 |
Allow configuration of the UDP port number ranges that networking
may use for the reliable channels for receiving ACK and resend messages Each reliable network channel requires an extra UDP port which other nodes use to send their ACKs and resend messages to. When several single process instances are used, each single process requires a unique UDP port number per reliable channel for this purpose. This enables control over the used UDP port numbers, which may be needed when firewalls are used. Solution: A configuration element "AllowedPorts" is added to either the networking channels or channel configuration, which specifies the range (list) of available UDP port numbers. When specified, an available port from the configured port range is used. When not specified, the configured channel port+1 is used when it is available; otherwise a dynamically allocated UDP port is used. (Note: the Deployment manual will be updated at the 6.2.2 release. Use the configurator tool for further information.) |
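The selection order described above can be sketched in Python (a hypothetical model; the function name and the fallback details beyond what the text states are assumptions): prefer a free port from the configured AllowedPorts range, else channel port+1, else a dynamically allocated port.

```python
# Hypothetical sketch of the ACK/resend port selection order: configured
# AllowedPorts range first, then channel port + 1, then an OS-assigned port.

def select_ack_port(allowed_ports, channel_port, is_free):
    """Pick the UDP port used for ACK/resend traffic on a reliable channel."""
    for port in allowed_ports:          # configured range, tried in order
        if is_free(port):
            return port
    if not allowed_ports and is_free(channel_port + 1):
        return channel_port + 1         # legacy default when no range given
    return 0                            # 0 = let the OS pick a dynamic port

in_use = {53401, 53402}
free = lambda p: p not in in_use
chosen = select_ack_port(range(53400, 53410), channel_port=53370, is_free=free)
fallback = select_ack_port([], channel_port=53370, is_free=free)
```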
OSPL-1326 |
Idlpp crashes when two types in different modules have the same name
and one of them has a keylist When an IDL file is used with 2 or more types in different modules that have the same name, and one or more have a keylist, idlpp will crash. Solution: The fault in idlpp has been fixed and it is now safe to use multiple types with the same name and a keylist in IDL. |
OSPL-1474/ 11036 |
Windows Event Log plug-in failure to initialise The Windows Event Log plug-in could fail with an access violation. This could prevent the OpenSplice daemon starting as a Windows service as this plug-in is configured to be used by default in this circumstance. Solution: The Event Log now initialises correctly. |
OSPL-1488 |
DDSI2 fails to handle keyed topics with 64 members (and some variations) DDSI2 potentially performs some key re-ordering to remove the dependencies between the ordering of keys in the original IDL topic definition, the order in which the members of the topic are serialised and the order in which the key fields are serialised. A mistake was found in the arithmetic that caused it to refuse topics with more than 64 members, or one with a key inside a nested struct with 32 fields inside one with 33 fields, etc. In such a case, no DDSI DataReaders and DataWriters would be created and no communication would take place for that topic, and the error log showed a generic error message: "handlePublications: new_writer: error -1" or "handleSubscriptions: new_reader: error -1". Solution: It now performs as intended and handles large structs correctly. |
OSPL-1521/ 11075 |
Datareader deletion could cause memory leaks Under certain conditions datareaders would not clean up their resources when they were deleted. These leaks could occur when readers refuse to connect to writers because of a partition mismatch. Solution: The defect in the cleanup algorithm of datareaders is now fixed and they will not leak anymore. |
OSPL-1522/ 11074 |
A reader subscribing to transient data using the "*" (wildcard)
partition when samples are written to multiple partitions may lead to
duplicate samples at the reader When a reader subscribes to a newly created topic/partition combination, because a writer on that partition has just been created, it should also request the historical data associated with that combination. The issue was that it was actually being delivered the historical data for all partitions, including those for which samples may have already been delivered and read/taken. Solution: The behaviour has been changed so that historical samples are only delivered for the specific topic/partition that the reader has just subscribed to. |
Report ID. | Description |
---|---|
OSPL-1326 |
idlpp crash when two types in different modules have the same name
and one of them has a keylist When an IDL file is used with 2 or more types in different modules that have the same name, and one or more have a keylist, idlpp will crash. For example: module A { struct Z { long m; }; }; struct Z { long m; }; #pragma keylist Z Solution: The fault in idlpp has been fixed. |
OSPL-1422 |
Lookup_instance leaks memory The DataReader_lookup_instance call leaked one string for topics with string-keys, when a non-existing instance was looked up in a non-empty set. Solution: The leak is fixed. |
OSPL-1424 |
DDSI2 can stop accepting data when receiving fragments of old samples The DDSI2 defragmentation buffers have limited capacity and, when full, decide what to accept and what to drop based on a policy that favours lower sequence numbers over higher ones in the case of reliable communication. This policy causes the buffers to fill up slowly when every now and then a subset of the fragments of a sample arrives, but not all of them. This can only happen for "old" samples, as they will already have been delivered to the data readers and no retransmits will be requested. Under normal circumstances it is very rare to receive some retransmitted fragments after receiving the full sample, but on networks with long delays, reordering and packet loss, such as a WAN, this problem is quite likely to surface. Solution: The current solution actively drops already accepted data from the defragmentation buffers, preventing uncontrolled build-up of incomplete samples. |
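A much simplified Python sketch of the underlying idea (not DDSI2's actual, more elaborate policy; the class and the in-order delivery assumption are illustrative): fragments belonging to samples that were already delivered in full can never complete into useful data, so they are dropped instead of slowly filling the defragmentation buffers.

```python
# Simplified defragmenter sketch: reject stale fragments of samples that
# have already been delivered, so they cannot accumulate as incomplete
# entries in the buffer. Assumes in-order completion for brevity.

class Defragmenter:
    def __init__(self):
        self._partial = {}        # seq -> set of fragment numbers received
        self._delivered_up_to = 0 # highest sequence number fully delivered

    def on_fragment(self, seq, frag, total):
        if seq <= self._delivered_up_to:
            return None           # stale retransmit of an old sample: drop
        frags = self._partial.setdefault(seq, set())
        frags.add(frag)
        if len(frags) == total:   # sample complete: deliver and clean up
            del self._partial[seq]
            self._delivered_up_to = max(self._delivered_up_to, seq)
            return seq
        return None

d = Defragmenter()
d.on_fragment(1, 0, 2)               # first half of sample 1
delivered = d.on_fragment(1, 1, 2)   # second half: sample 1 complete
stale = d.on_fragment(1, 0, 2)       # late duplicate fragment: dropped
```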
OSPL-1425 |
DDSI2 can request a retransmit of a full sample in addition to a
retransmit of some of its fragments DDSI2 considers the full state of a proxy writer and its data readers when generating retransmit requests in response to a writer heartbeat, with the aim of generating the least costly retransmit request for the missing data. However, it could generate a retransmit request for some fragments of a sample while simultaneously requesting a retransmit of the full sample. Solution: The retransmit request for the full sample is now suppressed in cases where a retransmit of some fragments is requested. |
OSPL-1426/ 10961 |
DDSI2 can crash on topic creation On platforms where malloc(0) returns a null pointer, DDSI2 could crash on topic creation. Solution: DDSI2 now explicitly allows for this. |
Report ID. | Description |
---|---|
OSPL-871 |
Incorrect determination of the lifespan The lifespan expiry time is determined when a message is inserted in the reader. If the ReaderLifespanQos was enabled, the lifespan duration was always extracted from the ReaderLifespanQos. However, both the ReaderLifespanQos and the (inline) LifespanQos of the message should be considered, and the earliest expiry time should be the one that is used. Solution: The expiry time determination algorithm has been fixed and will now also consider the (inline) LifespanQos of the message. |
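The corrected calculation amounts to taking the minimum of the two candidate expiry instants, as this small Python sketch shows (the function name and `None` convention for "no lifespan" are illustrative, not OpenSplice API):

```python
# Sketch of the fixed expiry determination: the sample expires at the
# earlier of the two candidate instants, computed from the reader lifespan
# and the sample's inline LifespanQos, both measured from insertion time.

def expiry_time(insert_time, reader_lifespan, inline_lifespan):
    """Return the instant at which the sample expires, or None if never."""
    candidates = [insert_time + duration
                  for duration in (reader_lifespan, inline_lifespan)
                  if duration is not None]
    return min(candidates) if candidates else None

# reader lifespan 10s, writer-supplied inline lifespan 3s: the sample must
# expire after 3 seconds, not 10 (the pre-fix behaviour).
t = expiry_time(insert_time=100.0, reader_lifespan=10.0, inline_lifespan=3.0)
```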
OSPL-997/ 10738 |
ReaderLifespan QoS not regarded in combination with read with condition When a read with condition is done and the ReaderLifespan QoS is set, the QoS is not evaluated. This can result in the reader returning samples that have already expired. Solution: The read with condition algorithm has been fixed and will now check the ReaderLifespan QoS when it is set. |
OSPL-1181 |
Record and Replay Service does not adhere to service lease settings The Domain configuration of OpenSplice contains a section on service leases: a mechanism that monitors the health of OpenSplice services. The RnR service did not correctly register with this mechanism, causing a message to appear in the ospl-info.log informing the user the R&R service has died while actually it was still running. Solution: The service now adheres to the 'Domain/Lease/ExpiryTime' configuration settings and the message is only printed to the OpenSplice info log when the service actually terminates unexpectedly. |
OSPL-1302 |
DDSI2 can miss local discovery events causing issues or blocking behaviour DDSI2 needs to react to the creation of readers, writers and participants on the same node, as well as to the creation of new partition-topic combinations and some housekeeping events. Under adverse timing conditions it may fail to process all events if some are combined into a single event record for efficiency. Solution: It now always processes all events in a single event record. |
OSPL-1324 |
Time-range not applied correctly in REPLAY commands An issue with processing time-ranges in REPLAY-commands caused problems when multiple interest-expressions are used in a single REPLAY-command. Time-ranges would only be applied to the first interest expression, causing data matching other interest-expressions to be replayed without the desired time-range constraints. Solution: Time-ranges are now applied to all interest-expressions in the command. In case the storage also contains data that should be replayed regardless of time-range constraints, a second REPLAY-command can be used. |
OSPL-1392 |
OSPL-XML metadescriptor optimizations The XML metadescriptor, which is used internally to communicate IDL types, was using fully-scoped names where relatively scoped names would suffice, and did no optimization on the ordering of types, which potentially results in a lot of unnecessary module open and close tags. Solution: Relatively scoped names are used wherever possible. An algorithm is introduced that orders types based on their module, so the number of module transitions is minimized. |
OSPL-1395 |
VxWorks distribution incorrectly included duplicated header files Prior to this release the VxWorks distribution erroneously included a second copy of some files from $OSPL_HOME/include/include in the directory $OSPL_HOME/include. This resulted in compilation errors whenever the required OpenSplice product include directories were specified in the order: -I"$(OSPL_HOME)/include" -I"$(OSPL_HOME)/include/sys". Solution: The duplicated files have been removed. Compiler header directory include directives can be specified in any order. |
Report ID. | Description |
---|---|
OSPL-511 |
Default OSPL XML configuration files not available in RTS The XML configuration files supplied by default with the HDE were not in the RTS installer. Solution: Configuration files standardised across HDE and RTS installers. |
OSPL-532 |
DDSI2 keyhash generation is wrong for some cases DDSI2 always generates the keyhashes included in the DDSI standard (they are spec'd as optional). For topics of which the (big-endian) CDR representation of the key is or may be longer than 16 bytes, the keyhash is computed as the MD5 hash of the serialized key. DDSI2 produces an incorrect hash on little-endian platforms for keys containing bounded strings where the total serialized length of the key is >= 17 and <= 32 bytes. In practice this affects interoperability when multiple nodes publish the same instance: between little- and big-endian machines running DDSI2, and between little-endian DDSI2 and any other DDSI implementation. Solution: Corrected the keyhash generation. |
OSPL-511 |
OpenSplice threads should be named To assist users (especially in the single-process architecture), the threads used by OpenSplice should be named. Solution: Major OpenSplice threads have been named. |
OSPL-558 |
Narrow operation in examples leaking The use of the c++ _narrow operation on the created readers and writer entities in some of the OpenSplice examples led to memory leaks. Solution: These leaks have been resolved by better use of the _var helper type when calling the _narrow operation. |
OSPL-593 |
OpenSplice domain service installation as a native Windows service At OpenSplice v6.1, running the OpenSplice domain services under the Windows Service Control Manager required the Windows Common Language Runtime. It also required the nomination on installation of a single global log directory that would hold all OpenSplice service and application process logs. When selecting to install the domain as a Windows service, an incompatible 'single process mode' OpenSplice XML configuration file was installed by default. Solution: Installing and running OpenSplice services as a Windows Service now uses only unmanaged APIs so the .NET CLR is no longer required. A global log directory is no longer required or prompted for during product installation. The installer will specify a 'shared memory' service configuration and will include a domain configuration entry to direct the log output from service processes to the Windows Event Log, for instance: <Domain> ... <ReportPlugin> <Library file_name="service_logger" /> <Initialize symbol_name="service_logger_init" /> <TypedReport symbol_name="service_logger_typedreport" /> <Finalize symbol_name="service_logger_shutdown" /> <SuppressDefaultLogs>True</SuppressDefaultLogs> <!-- Change below to 'False' if you wish to log OpenSplice system log events from application processes to the Event Log also --> <ServicesOnly>True</ServicesOnly> </ReportPlugin> </Domain> See the Deployment Guide for more information regarding log plug-in configuration. Note that by default OpenSplice log events from application processes will still be written to ospl-error/info.log files in the local directory, just as when not running OpenSplice as a Windows Service, but this can be changed as per the comment in the snippet above. Note also that this entry can still be added to your OpenSplice domain configuration when OpenSplice is not installed as a Windows Service, to direct service (or service and application) log output to the Event Log. |
OSPL-659 |
The C++ using the standalone C DCPS API 'PingPong' example has been removed The C++ using the standalone C DCPS API 'PingPong' example duplicated the code already available in the standalone C DCPS API 'PingPong' example, unnecessarily so given that the use of a C API from C++ is trivial. Solution: The example has been removed. Users wishing to use the standalone C DCPS API from C++ should refer to the standalone C examples. These can be compiled with a C++ compiler. |
OSPL-711 / OSPL-788 |
Scheduling parameters for the top level service threads do not work when using single process deployment When applying scheduling properties (OpenSplice/Domain/Service/Scheduling) in a single process deployment, the scheduling properties are not applied to the service. Solution: The defect is now fixed and the service thread will get the configured scheduling settings. |
OSPL-775 |
Exception 'throw' statements in RMI C++ generated code For consistency and best practice, the C++ exception specification is no longer used; instead the IDL to C++ mapping rules are followed. Solution: All 'throw' specifications were removed from the RMI C++ generated code. |
OSPL-778 |
DDSI2 does not always deliver data for wildcard-based subscriptions To deliver data from remote writers to subscriptions using partition wildcards, DDSI2 sometimes needs to locally register the actual partition used. However, the optimisation used to avoid doing so unnecessarily did not correctly handle the case in which a wildcard reader was matched to writers in different partitions. Solution: Optimisation corrected. |
OSPL-781 / 10369 |
The reliable channel does not operate correctly when using IPv6 on the Windows platform On Windows a bind is performed on the sending socket using the same port number as used by the receiving socket. This may cause ACK messages not to be received, so that reliable communication fails. Solution: The bind is not necessary on the sending socket and has been removed. |
OSPL-801 |
C API ReaderDataLifecycleQosPolicy QoS policy attribute inconsistency There is an inconsistency in the naming of the ReaderDataLifecycleQosPolicy QoS policy attribute 'invalid_sample_visibility' between the C (SAC) API and other APIs. In the other APIs it is called 'invalid_sample_visibility' while in the C (SAC) API it is called 'invalid_samples_visibility' (in plural form). Solution: The ReaderDataLifecycleQosPolicy QoS policy attribute in the C (SAC) API has now also been changed to 'invalid_sample_visibility'. |
OSPL-807/ 5725 |
Durability resource_limits.max_samples QoSPolicy is not applied properly When a max_samples resource limit is set in the QoS of the Topic, samples could get rejected even though the limit wasn't reached. This was caused by samples that overwrote a previous value sometimes incrementing the sample counter in the administration when that counter should have remained at the same value. Solution: The counter for the number of samples is no longer increased when a sample overwrites a previous value. |
OSPL-808/ 7136 |
Tuner does not allow editing of array elements When connecting the Tuner to a running system and creating a reader/writer for a topic containing array elements, the elements of the array cannot be modified in the Writer tab. Solution: The defect in the Writer tab has been fixed and array elements can now be edited. |
OSPL-809/ 8239 |
Durability does not allow the alignmentKind for the built-in partition to be overruled The durability service does not allow the alignmentKind to be overruled for the name space containing the built-in partition. If it is set to Lazy, durability will create an extra name space with the built-in partition and alignmentKind set to Initial_And_Aligner. Solution: The durability service now no longer overrides the behaviour for the built-in topics, but only if they are disabled in the Domain configuration (//OpenSplice/Domain/BuiltinTopics[@enabled]). |
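The switch referenced above can be sketched as follows; this is a minimal illustrative fragment (the domain name is a placeholder), and the Deployment Guide should be consulted for the full Domain section:

```xml
<OpenSplice>
  <Domain>
    <Name>exampleDomain</Name>
    <!-- With the built-in topics disabled, durability no longer
         overrides the alignmentKind for the built-in partition -->
    <BuiltinTopics enabled="false"/>
  </Domain>
</OpenSplice>
```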
OSPL-814/ |
When spliced discovers a disconnecting node it only unregisters 1 DataWriter per partition-topic combination When spliced discovers a disconnecting node by its missing heartbeats, it unregisters all writers belonging to the disconnected node. However, at most one DataWriter is unregistered for a specific partition-topic combination. If the disconnected node has more than one writer attached to the same partition-topic combination, only one of them will be unregistered. Solution: The internal algorithm that takes care of the unregistration of disconnected DataWriters has been changed to ensure all DataWriters of the disconnected node are unregistered. |
OSPL-820 |
ospl tool should ensure the environment is cleaned up before termination when started in blocking mode When the ospl tool is started in blocking mode (ospl -f start), it should ensure that both the shared memory segment and the key-file are deleted before terminating. In the normal case on UNIX platforms where spliced is requested to stop with a SIGQUIT or SIGTERM, that service will clean up all resources itself. However, when spliced is terminated with a SIGKILL, ospl should take care of the clean-up. Solution: Ospl now ensures that the key-file and shm segment are cleaned up when started in blocking mode, even when spliced is killed with a SIGKILL. |
OSPL-836 |
DDSI2 fails to report configuration errors in error log DDSI2 configuration error handling lost the ability to write configuration-related error messages to the OpenSplice error log. It instead printed them on stderr, which may be noticed on Unix boxes when using "ospl start", but is typically lost on Windows. Solution: DDSI2 now reports configuration errors to the error log again. |
OSPL-844 |
Issues with starting OpenSplice Tuner and OpenSplice DCG Setting the SPLICE_TARGET environment variable could result in problems starting the OSPL Tuner and DCG tools. Solution: The SPLICE_TARGET is no longer set by the Tuner and dcg scripts. |
OSPL-846 |
SampleInfo should be extended with reception_timestamp for all language bindings For all language bindings, a new field has been added to the SampleInfo called reception_timestamp. This field represents the local time at which the corresponding sample has been delivered to the reader from which it is being accessed. |
OSPL-854/ 10511 |
read/take returns NO_DATA while the reader has data In case the data sequence and sample info sequence _maximum are set to 0 and the _release flag is set to TRUE, a read or take call with max samples set to unlimited will always return with code NO_DATA. Solution: The defect in the read/take call is solved and the correct data is now returned. |
OSPL-890/ 10558 |
Queried meta descriptor incorrect Sometimes a queried meta-descriptor string seems incomplete and is not equivalent to the original meta-descriptor. Solution: An internal error in the XML meta-descriptor (de)serializer has been repaired to fix incorrect behaviour on types with inter-module dependencies. |
OSPL-904 |
Reader returns NO_DATA, while samples are available and not read When using read_next_instance_w_condition where
Solution: The defect in the read function has been fixed and now returns a sample when available. |
OSPL-908 |
Crash when calling DDS_DataReader_create_view in combination with content filtered topics When calling the DDS_DataReader_create_view function and the system contains content filtered topics a crash can occur. Solution: The defect in the DDS_DataReader_create_view call has been fixed and will no longer crash. |
OSPL-924 |
Faulty mask evaluation on dataView The mask evaluation on dataView was faulty; because of this, queries could be triggered continuously. Solution: The defect has been fixed and the masks are now evaluated properly. |
OSPL-942 |
mmstat maintenance Minor work to improve mmstat
|
OSPL-958 |
User clock module dlopen error in error log file. When a UserClockService is configured, the configured library does not load and an error (dlopen error) is reported in the error log. Solution: The library resolving algorithm contained a fault and has been fixed. Library resolving now follows these rules:
|
OSPL-970 |
Improvements to DDSI2 network interface selection In multi-homed systems it is often necessary to instruct DDSI2 which network interface to use using the General/NetworkInterfaceAddress setting. Until now, this required specifying the exact IP address of the host for that network interface, in essence requiring each host to have its own configuration file. Solution: New options for specifying the network interface have been added:
|
OSPL-977 |
DDSI2 can crash when a remote entity disappears DDSI2 can potentially crash when a remote entity disappears. This requires its lease to expire in parallel with processing an explicit notification of that remote entity being deleted. Solution: A fix is applied to prevent the crash. |
OSPL-1008 |
Applications may deadlock or crash during signal handling Applications may deadlock or crash when a signal is received within the internal signal handling algorithm. Solution: The internal signal handling algorithm has been adapted to allow users to set their own signal handler. Furthermore, the algorithm ensures that handling of asynchronous signals is performed in a dedicated thread to prevent deadlocks during clean-up of DDS entities. |
OSPL-1021 10758 |
Clash with TEST macro from c_mmbase.h The database code referenced a macro named TEST which could possibly clash with other external software and application code. Solution: This has been renamed to a better scoped C_MM_STATS macro. |
OSPL-1033 /10767 |
Reference manuals incorrect for create_participant The reference manuals still had the V5 API definition documented for the create_participant API. Solution: Reference manuals now correctly state that the domainId is an integer. |
OSPL-1035 /10764 |
DDSI2 failure to deserialize bounded strings of maximum length DDSI2 validates all data coming in over the network, but the input validator erroneously considered strings in received network packets of which the length equalled the specified bound as being oversize. Solution: DDSI2 now handles this correctly. |
OSPL-1049 /107678 |
Java application crash on type register Randomly, on type register, a Java application can crash with the following notice: malloc(): memory corruption. Solution: The type register memory allocation algorithm for Java has been fixed and will no longer crash. |
OSPL-1077 |
The dynamic discovery protocol does not detect all nodes for RT networking. The dynamic discovery protocol makes use of unicast communication to find all the nodes and roles. However, the data is sent to the wrong port number, which, depending on the configuration, may mean not all nodes are detected. Solution: When sending point-to-point data on a best-effort channel, the destination port should be the primary port of the channel. |
OSPL-1085 |
Secure Networking parsing of Credentials tag in XML configuration fails Solution: The spelling mistakes in the secure network configuration files have been corrected to be "Credentials" rather than "Credentails". |
OSPL-1115 |
Specific configuration may cause a crash of secure RT networking. When the configuration of secure networking does not contain specific elements, secure networking may crash because it frees unallocated memory. Solution: The service now checks whether configuration elements are allocated before freeing them. |
OSPL-1125 |
The OpenSplice installer no longer uses Windows SDK Global Assembly
Cache Utility (gacutil.exe) to install the C# binding assembly. Previously, selecting an option when running the OpenSplice installer used the Windows SDK Global Assembly Cache Utility (gacutil.exe) to install the C# binding assembly into the Global Assembly Cache. The MSDN documentation says the following regarding this tool: "In deployment scenarios, use Windows Installer 2.0 to install assemblies into the global assembly cache. Use the Global Assembly Cache tool only in development scenarios, because it does not provide assembly reference counting and other features provided when using the Windows Installer." Solution: The installer option of using the gacutil.exe to install the assembly has been removed. If users wish to install the C# binding assembly into the Global Assembly Cache for development purposes they may use the gacutil.exe themselves to do this. Instructions for doing this are included in the Deployment Guide. Users should use a suitable approved method to install the assembly in deployment scenarios, if required. |
OSPL-1127 |
The supported Access Control Module for Secure RT networking is
identified as "MAC" which is not correct in the implementation. In the configuration the currently supported Access Control Module is MAC, but in the implementation another name was used. Solution: The name of the supported Access Control Module has been changed to "MAC". |
OSPL-1131 |
DDS_string_dup method added to the SAC DCPS mapping The DCPS SAC mapping did not have a convenient function that could be used to duplicate strings. Solution: A DDS_string_dup function has been added to the SAC DCPS mapping. Its signature is
DDS_char* DDS_string_dup (const DDS_char* src);
The memory allocated must be freed using DDS_free().
|
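The ownership contract described above can be illustrated with a minimal stdlib stand-in. Note that `example_string_dup` below is a hypothetical illustration, not the real API: the actual DDS_string_dup uses OpenSplice's own allocator, and its result must be released with DDS_free(), never with plain free().

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for the SAC helper: it allocates a copy of the
   input string that the caller must release afterwards. (The real
   DDS_string_dup allocates via OpenSplice's allocator and pairs with
   DDS_free().) */
char *example_string_dup(const char *src)
{
    char *copy;
    if (src == NULL) {
        return NULL;  /* defensive choice in this sketch */
    }
    copy = (char *) malloc(strlen(src) + 1);
    if (copy != NULL) {
        strcpy(copy, src);
    }
    return copy;
}
```

With the real API the pattern is the same shape: `DDS_char *s = DDS_string_dup(name); ... DDS_free(s);`.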
OSPL-1148 |
end_coherent_changes() may crash if the publisher contains a writer that
did not write any samples in that coherent update There was an issue where calling end_coherent_changes() on a publisher may cause a crash if that publisher contained a data writer that had not written any samples as part of that coherent update. The algorithm assumed that every writer belonging to a coherent publisher would write samples within each coherent update. Solution: The algorithm in end_coherent_changes() has been fixed to also support data writers that were not active during that coherent update. |
OSPL-1156 |
The Secure RT Networking security policy file is not parsed correctly. The security profile should contain a classification element in the resource section. This element is not parsed correctly. Solution: Add the parsing of the classification element to the security profile parser. |
OSPL-1157 |
Secure RT networking may drop messages when authorization is enabled. The secure RT networking implementation uses libcrypto. Furthermore, secure networking is multi-threaded, so multiple threads may access this library concurrently, which causes the checking of the RSA signature to fail. Solution: Callbacks providing the required locking mechanism have been implemented so that libcrypto can be operated from concurrent threads. |
OSPL-1158 |
DDSI2 could fail to provide historical transient local data in
configurations without durability In OpenSplice, delivering historical data to new non-volatile readers is normally handled by the OpenSplice kernel and durability service. However, the DDSI specification prescribes an implementation of transient-local data that DDSI2 adheres to, allowing many minimum-profile applications to run on OpenSplice without a durability service present. In this configuration, DDSI2 did not always provide the historical data, instead discarding a sample as if it were a notification of a lost sample. Solution: This issue has been fixed by correctly setting the relevant data. |
OSPL-1310 |
Configurator tool does not allow configuring the Report element The //OpenSplice/Domain/Report element is missing in the configurator tool. Solution: The configuration element plus its attributes have been added to the configurator tool. |
OSPL-1312 |
Default Lease ExpiryTime and updateFactor are incorrect in configurator tool The default Lease ExpiryTime is set to 10.0 in the splice daemon but to 20.0 in configurator tool. The updateFactor is set to 0.2 in the splice daemon but to 0.1 in the configurator tool. Solution: The configurator tool has been updated to match splice daemon implementation. |
OSPL-1314 |
Inconsistent linkage of OpenSSL to secure networking services on Windows x86-64 On the above platform the secure networking services were dynamically linking OpenSSL, requiring the user to install OpenSSL in order to use them. This was at odds with the behaviour on all other platforms. Solution: The secure networking services statically link a version of OpenSSL on Windows x86-64. The secure networking services will be changed to dynamically link to OpenSSL in a future version of the product. |
Report ID. | Description |
---|---|
OSPL-1017/ 10755 |
Added support for Wind River Linux 4.3 on Freescale MPC8308 Solution: Implemented platform support for Wind River Linux 4.3 on the Freescale MPC8308. |
Report ID. | Description |
---|---|
OSPL-1017/ 10755 |
Durability crashes when alignment data contains only unregister messages The fix for this issue in the 6.1.1p2 release was not applied correctly. Solution: The patch has now been correctly applied to resolve the issue. |
OSPL-1092/ 10791 |
Liveliness notifications to data readers created on multiple partitions may not happen If a data reader is created on two partitions, say "A" and "B", then it will only receive liveliness notifications for the partition that is last lexicographically, i.e. only for "B", despite a new data writer being created on partition A. Solution: The algorithm that determines whether a data reader's partitions match those of a data writer has been fixed to ensure that it supports a data reader with multiple partitions. This means that a notification can now happen for more than just the final partition in the list. |
Report ID. | Description |
---|---|
OSPL-1017/ 10755 |
Durability crashes when alignment data contains only unregister messages In the scenario where there are only unregister messages aligned by the durability service for a given partition-topic combination, the service would crash because an instance lookup would fail. Solution: Durability is made robust against this kind of scenario. |
OSPL-1035/ 10764 |
DDSI2 failure to deserialize bounded strings of maximum length DDSI2 validates all data coming in over the network, but the input validator erroneously considered strings in received network packets of which the length equalled the specified bound as being oversized. Solution: The error in the validator has been fixed. |
Report ID. | Description |
---|---|
OSPL-892/ 10091 |
DDSI2 can respond with incorrect set fragments to retransmission requests The DDSI protocol allows reliable readers to request retransmission of individual fragments of large samples. DDSI2 could respond with fragments other than the ones requested, which could generally be masked by an eventual request for the full sample to be retransmitted, but would cause the communication to come to a complete standstill if a particular fragment always got lost. For example, when the rapid (re)transmission of a large amount of data systematically lead to a switch throwing away a whole range of packets. Solution: DDSI2 now always responds with the requested fragments. |
OSPL-919 |
C# Examples do not compile on Windows 64 bit platforms The C# Examples do not compile on Windows 64 bit platforms. This was caused by 2 problems: - missing entries in the solution file - bugs in the example source code Solution: The missing entries have been added to the solution file and the bugs fixed in the example source code. |
Report ID. | Description |
---|---|
OSPL-301/ 10110 |
handle.serial exceeds HANDLE_SERIAL_MASK Two different perceptions of the maximum serial number for a handle existed within the product. Once the serial became larger than the smallest maximum, the reported error occurred. Solution: The maximum serial number to be used is now the same in both locations in the product. |
OSPL-508/ 9816 |
Idlpp crashes on invalid recursion. Solution: idlpp will check if a type is defined before it is used - except for sequences where recursion is allowed. |
OSPL-635 10466 |
Unable to determine adapter name on Windows On Windows it was possible that OpenSplice was unable to determine the network adapter name. Solution: The defect in the name resolving mechanism is fixed according to the MSDN page solution and a proper adapter name is now resolved. |
OSPL-637 10466 |
Handling loopback data in native networking When using shared memory with native networking, the configuration will never 'see' a co-located single-process instance on the same node. Secondly, data that loops back (i.e. 'own sent data') is detected in a sub-optimal way. Solution: The defect in native networking is solved and a new configuration item 'EnableMulticastLoopback' for native networking is introduced to optimize loopback data. EnableMulticastLoopback specifies whether the networking service will allow IP multicast packets within the node to be visible to all networking participants in the node, including itself. It must be TRUE for intra-node multicast communications, but if a node runs only a single OpenSplice networking service and does not host any other networking-capable programs, it may be set to FALSE for improved performance. |
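As a sketch, the new option would appear in the RT networking service configuration roughly as follows. The placement under General and the service name are assumptions; the exact location should be checked against the Deployment Guide:

```xml
<OpenSplice>
  <NetworkService name="networking">
    <General>
      <!-- TRUE (default) is required for intra-node multicast;
           FALSE may improve performance on nodes that run only a
           single networking service -->
      <EnableMulticastLoopback>true</EnableMulticastLoopback>
    </General>
  </NetworkService>
</OpenSplice>
```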
9963/ dds3426 / OSPL-646 |
DDSI2 used incorrect encoding for fragment data message headers DDSI2 incorrectly used to generate and interpret fragmented data message headers as if they were slightly extended versions of the non-fragmented data message headers. This caused DDSI2 to be non-compliant with respect to the standard and to fail to interoperate with other vendors' implementations for large samples. Solution: the setting and interpretation has been corrected. This breaks backwards compatibility, but because DDSI2 is still in beta, this does not constitute a change of policy. For those exceptional cases where backwards compatibility is currently an issue, a setting Unsupported/LegacyFragmentation has been introduced, which may be set to true to continue using and interpreting the old message format. |
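For the exceptional cases where backwards compatibility is needed, the option described above can be enabled with a fragment along these lines (the service name is illustrative; the element path follows the Unsupported section named in the text):

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Unsupported>
      <!-- Revert to the pre-fix, non-interoperable fragment
           header format for compatibility with older DDSI2 peers -->
      <LegacyFragmentation>true</LegacyFragmentation>
    </Unsupported>
  </DDSI2Service>
</OpenSplice>
```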
OSPL-780 |
Incorrect handling of ioctl return code The return result from a call to ioctl should be interpreted as successful if the result is not equal to ERROR. The existing code used a result equal to zero as the criterion for success, but positive non-zero values are also valid. Solution: The defect in the abstraction layer is now fixed and positive non-zero values are now also treated as valid. |
OSPL-783 |
IPv6 DontRoute emulation incorrect Due to lack of support for SO_DONTROUTE on some IPv6 stacks, networking tried to emulate the behaviour by setting the hop-count to 1. This is not functionally equivalent to SO_DONTROUTE. Solution: The emulation of the option has been removed. When set on an IPv6 configured networking service the option will be ignored. |
OSPL-784 |
Erroneous message logging when sending initial ACKs if multiple ACK messages were bundled in a single packet Erroneous messages were logged when multiple ACK messages were bundled into a single packet in the case of multiple partitions, where partition 1 only has a first ACK to be sent (no pending ACKs) and partition 2 has a first ACK and/or pending ACKs. Solution: The defect in the ACK logging mechanism has been fixed and the correct messages are now logged. |
OSPL-813 |
Consistent final value not always guaranteed with BY_SOURCE_TIMESTAMP In the unusual scenario where a single writer updates an instance with the same timestamp, a consistent final value for that instance was not guaranteed across all subscribers. According to the DDS v1.2 spec this should be guaranteed in this case. Solution: When updating the administration of the readers the consistent final value is guaranteed by incorporating a writer-generated sequence number. |
OSPL-821/ 10098 |
DDSI2 socket receive buffer sizes can be too small to handle large packet bursts Large incoming packet bursts could overwhelm the configured buffer capacity of the network sockets in the DDSI2 networking service. This was dependent on many factors, in particular also on scheduling latencies at the OS level. Solution: The default receive buffer size is now the minimum of the new Unsupported/MinimumSocketReceiveBufferSize option and the operating system default UDP socket buffer size. The default value of 64kB should suffice for most systems but can be increased where needed. |
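A hedged sketch of raising this lower bound follows. The service name and the value notation with a size suffix are assumptions; consult the Deployment Guide for the accepted formats:

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Unsupported>
      <!-- Raise the floor on the UDP socket receive buffer when
           large packet bursts overwhelm the 64kB default -->
      <MinimumSocketReceiveBufferSize>1MB</MinimumSocketReceiveBufferSize>
    </Unsupported>
  </DDSI2Service>
</OpenSplice>
```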
OSPL-827 |
liveliness count issue with multiple partitions When using data readers across multiple partitions, the liveliness count increments for all writers instead of only the writers connected to the selected partition. Solution: The defect in the liveliness count algorithm has been fixed; only the liveliness count for the writers in the selected partition is now updated. |
Report ID. | Description |
---|---|
dds3554 |
e500v2 based builds need to include the VX_SPE_TASK option when spawning RTPs. Updated e500v2 based builds to include the VX_SPE_TASK option when spawning RTPs. |
Report ID. | Description |
---|---|
dds3554 |
PowerPC P2020 and PPC32 vxworks 6.8, linux hosted builds now supported. Added PowerPC P2020 and PPC32 vxworks 6.8, linux hosted builds. |
Report ID. | Description |
---|---|
dds3508/ 10342 |
spliced not running on LynxOS 5. Problems with spliced not executing correctly on LynxOS 5. Solution: Multiple fixes required, invalid makesystem options and some extra LynxOS posix support required. |
Report ID. | Description |
---|---|
8840 / 9350 / dds2973 / dds2590 |
Memory leaks during shutdown of native networking. During the shutdown of native networking not all administration structures are freed. Solution: The defect in the network termination algorithm is now fixed and all administration is correctly freed. |
10065 / dds3306 |
DDSI2 may erroneously decide CDR serialized data is invalid The DDSI2 deserializer attempts to avoid allocating memory for sequences of obviously bogus lengths by checking whether the remaining number of bytes is sufficient to encode some sequence of the declared length. Unfortunately, it uses the wrong notion of the size of an element, which can cause it to incorrectly declare a length to be bogus. This is dependent on platform, type and the type and contents of any data preceding the sequence. Solution: Always deserialize all data until the input is really exhausted. |
10064 / dds3339 |
Reliable communication fails with multiple connected RT networking services on one node The statically configured port numbers collide with multiple instances, causing unicast data not to be received by all RT networking instances. For reliability-related packets (ACKs, resends, etc.) unicast is used on a static port (data-port + 1), even when broad- or multicast is configured. This implies that multiple instances try to use the same IP-address/port-number combination, which is not possible. This causes only the latest-bound RT networking service to receive all ACKs and resends for any instance on the same IP-address/port-number combination, breaking reliable communication and causing a lot of warnings to be reported regarding unexpected ACKs, etc. Solution: RT networking uses a dynamically assigned port number instead of a statically configured port number for the reliability-related packets. |
10125 / dds3391 |
OpenSplice DDS configured with multiple native network services and the IgnoredPartitions configuration option set. When OpenSplice is used with multiple native networking services and IgnoredPartitions is configured, writing multiple instances with a single DataWriter fails: the first instance is written but all further instances are not. Solution: The defect in the network administration algorithm in combination with ignored partitions is now fixed. |
9963 / dds3426 dds3478 |
DDSI2 used incorrect encoding for fragmented data message headers. DDSI2 incorrectly generated and interpreted fragmented data message headers as if they were slightly extended versions of the non-fragmented data message headers. This caused DDSI2 to be non-compliant with the standard and to fail to interoperate with other vendors' implementations for large samples. Solution: The encoding and interpretation have been corrected. This breaks backwards compatibility. For those exceptional cases where backwards compatibility is currently an issue, a setting Unsupported/LegacyFragmentation has been introduced, which may be set to true to continue using and interpreting the old message format. |
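For deployments that must keep interoperating with nodes still using the old format, the legacy behaviour can be re-enabled in the configuration. A minimal sketch, assuming the element lives under the DDSI2 service's Unsupported section (the service name and surrounding elements are illustrative, not taken from the release note):

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Unsupported>
      <!-- Re-enable the old, non-compliant fragmented-data encoding -->
      <LegacyFragmentation>true</LegacyFragmentation>
    </Unsupported>
  </DDSI2Service>
</OpenSplice>
```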
Report ID. | Description |
---|---|
9963 / dds3426 dds3478 |
DDSI2 fragment size is now configurable. DDSI2 never creates data messages containing a payload larger than the FragmentSize; any sample larger than the FragmentSize is split into multiple fragments of FragmentSize each. These fragments are then transported independently (but may yet be merged into larger UDP datagrams). Solution: This size is now configurable using Unsupported/FragmentSize, with a default of 1280 bytes. Values below 1025 bytes violate the DDSI specification; above approximately 65000 bytes the message will (probably) not fit inside a single UDP datagram. Increasing the size shifts more fragmenting and reassembling to the IP stack, which is generally more efficient because it is done inside the network stack, but which is incapable of retransmitting individual lost fragments. Increasing it may also allow operating without any fragmenting at the DDSI level, which may help avoid interoperability issues. |
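The fragment size can be tuned in the same configuration section; a hedged sketch (the 8192-byte value is only an example within the valid range described above, and the surrounding elements are illustrative):

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Unsupported>
      <!-- Payload fragment size in bytes; default 1280, minimum 1025 -->
      <FragmentSize>8192</FragmentSize>
    </Unsupported>
  </DDSI2Service>
</OpenSplice>
```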
dds3486 |
Third party licenses updates OpenSplice Tester third party tool licenses were not documented in release notes. Solution: Updated docs/html/third_party_licenses.html. |
Report ID. | Description |
---|---|
dds3467/ 10259 |
Signal handling failing on DENX. Signals were not being correctly handled for the DENX target. Solution: Use the generic POSIX implementation. |
dds3217/9881 dds3430 |
Spliced shared memory leak on remote durability shutdown. A memory leak occurred whenever a remote node was started and then stopped while durability was also running. Solution: When durability detected a remote node, it would write an update into the system (and thus into the shared memory on the local node). The logic for this write wrongly performed a double string creation for the same string. The second string creation overwrote the pointer to the first created string, which resulted in the first created string never being freed when the remote node disconnected. The solution was to remove the first string creation, as it was superfluous. |
dds3248/ 9908 |
Partition expressions not matched properly. When an application publishes/subscribes to multiple partitions and the list of partitions contains names that are substrings of others in the list, some partitions may be ignored. Even though communication within a node and communication over the network using the native networking service still work correctly, this issue causes communication over the network to fail when using the DDSI networking service, and it also causes listeners not to be triggered on matched subscriptions and publications. Solution: The error in the pattern matching algorithm in the kernel has been repaired. |
dds3399/ 10139 |
Incorrect network statistics. The networking service optionally keeps track of statistics that can be inspected by means of the Tuner tool. After inspection it turned out that the maxNumberOfPacketsResentToOneNode and maxNumberOfBytesResentToOneNode statistics are showing wrong values. Solution: The maxNumberOfPacketsResentToOneNode and maxNumberOfBytesResentToOneNode statistics are now showing the correct values. |
dds3416/ 10152 |
Manual liveliness not working correctly. Specific circumstances, such as manual liveliness assertion, could trigger a bug in the lease manager that would eventually cause the lease manager thread to hang indefinitely. The result is that periodic lease actions, such as the sending and receiving of heartbeats, stop. Solution: The lease manager was fixed. |
dds3432/ 10187 |
Applications SEGFAULT when the domain is not started. When no DDS domain is running but a dirty shared memory segment from an old spliced instance is still present, a starting application could crash with a SEGFAULT. Solution: The defect is fixed and the application no longer crashes because of this. |
dds3455 |
The durability service should improve the memory used during alignment. The durability service temporarily caches received alignment data until the set for a specific partition-topic combination is complete. The algorithm implemented there could be improved to reduce the amount of memory used during this phase. Solution: The durability service now stores unregistrations in a much more memory-efficient way, reducing the memory overhead for alignment to a minimum. |
dds3457/10241 dds3458/10242 |
Crash of networking and/or durability due to memory exhaustion. The networking and durability services could crash when the shared memory was exhausted. Solution: The services now check the available-memory threshold and do not claim more memory when the threshold has been reached. Furthermore, they will terminate when no more memory becomes available within a few seconds. |
Report ID. | Description |
---|---|
dds3443/ |
Added support for VxWorks 6.8.2 on MIPSI32R2sf. There was no port for VxWorks 6.8.2 on MIPS. Solution: A build for VxWorks 6.8.2 on MIPSI32R2sf has been added. |
Report ID. | Description |
---|---|
dds3217/ 9881 |
Spliced shared memory leak on remote node shutdown. When a remote node joins the system (e.g., ospl start is executed on a remote node) and then leaves it (e.g., ospl stop is executed on that node), a structural increase in shared memory is observed. Solution: One memory leak was fixed where the heartbeat message of the splice daemon was leaked, and another memory leak was fixed where the durability history_kind was wrongly mixed with the regular history depth instead of the value of the durability history depth. |
dds3391/ 10125 |
OpenSplice DDS configured with multiple native network services and the IgnoredPartitions configuration option set. When OpenSplice is used with multiple native networking services and IgnoredPartitions is configured, writing multiple instances with a single DataWriter fails: the first instance is written but all further instances are not. Solution: The network administration algorithm now takes ignored partitions into account as well. |
Report ID. | Description |
---|---|
dds3240 |
Usability improvement for OpenSplice RMI. The rmipp compilation requires a step to also compile its output with idlpp. There is no need for a user to initiate this. Solution: rmipp now directly calls the idlpp step, removing the unnecessary user action. |
dds3299 |
Added support for IPv6 on VxWorks 6.7 and 6.8. Support for IPv6 has been added on VxWorks 6.7 and 6.8, including workarounds for issues with the VxWorks inet_pton and inet_ntop functions (Wind River TSR#1085626). |
Report ID. | Description |
---|---|
dds3348 |
Robustness of secure-networking service against malformed incoming packets The secure networking service should be robust against malformed incoming packets, but certain cases were (although detected) not handled correctly and could lead to a crash of the secure-networking service. Solution: Handling of malformed incoming packets has been fixed, so that they are correctly ignored. |
dds3354/ 10083 |
DBMSConnect - Bounded strings would get mapped by DbmsConnect to a VARCHAR column with width 6000 DBMSConnect ignored the length of a bounded string and treated it as an unbounded string, which is mapped to a SQL'99 VARCHAR column with a width of 6000. Solution: DBMSConnect uses the specified maximum length of a bounded string to determine the appropriate width of a column. |
dds3363/ 10094 |
Installation of the OpenSplice daemon as a Windows service from the Visual Studio 2010 distribution fails with an abort. Installation of the OpenSplice daemon as a Windows service from the Visual Studio 2010 distribution failed with an abort because the installer used an obsolete API. Solution: The installer no longer uses the obsolete API, so the installation no longer fails with an abort. |
dds3365/ 10098 |
DDSI - Unnecessary packet drops for large "best-effort" messages (>8 kB) when using DDSI2 on the Windows platform. When receiving "best-effort" messages larger than 8 kB via the DDSI2 service on the Windows platform, a higher packet-drop rate than expected was observed. This was caused by the UDP receive buffer size on this platform, which defaults to just 8 kB. Solution: An enhancement has been implemented that sets the UDP receive buffer size to 64 kB if the platform default is lower than that. This ensures that the UDP receive buffers are big enough for the largest supported packet size. |
dds3366 |
Shared memory leaks in OpenSplice RMI. In OpenSplice RMI, each request and each reply uses a different instance. These instances were not disposed and unregistered after the request/reply. Furthermore, the reply readers were not taking the samples that were destined for other clients. These two problems led to a shared memory leak: shared memory consumption grew with each request/reply until the process ended. Solution: After writing the request/reply, the corresponding writer disposes and unregisters the corresponding instance. After taking its reply, the reply reader takes all samples destined for other clients. |
Report ID. | Description |
---|---|
dds2723 |
DataReader does not support the TIME_BASED_FILTER QoS policy The time-based filter QoS policy allows for pacing of a DataReader, based on a minimum separation time when receiving samples. Solution: The feature is now implemented and described in the OpenSpliceDDS Reference Manuals. |
dds2784 |
The DomainParticipant find_topic() function blocks for the full timeout period Where the topic of interest in the DomainParticipant find_topic() call did not exist, the find_topic method was blocking for the full timeout period even if the topic became available earlier. Solution: If no matching topic is found on find_topic() call, the function now checks for a matching topic every 10ms until a matching topic is found or the timeout is reached. |
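The described fix amounts to a poll-until-deadline loop; this is a generic sketch of that pattern in Python (find_once, the 10 ms interval, and the return convention mirror the description above; none of it is the actual middleware code):

```python
import time

POLL_INTERVAL = 0.010  # 10 ms between lookup attempts, as in the fix described above

def find_topic(find_once, timeout):
    """Call find_once() every 10 ms until it returns a topic or the
    timeout (in seconds) elapses; return None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        topic = find_once()
        if topic is not None:
            return topic
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        # Never sleep past the deadline
        time.sleep(min(POLL_INTERVAL, remaining))
```

The same structure (early return on success, bounded sleep otherwise) is what removes the "block for the full timeout even though the topic appeared earlier" behaviour.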
dds2846 |
Issues with multiple domains on Windows A number of problems exist in the Windows abstraction layer that become noticeable if multiple domains are active and the default SHM mapping address is used. File locks are not released causing problems at shutdown and kernel objects (condition events, semaphores) are mixed up between the active domains. Solution: A number of improvements were made to ensure a correct operation of OpenSplice on Windows. |
dds3003 |
Cleanup problem in C++ PSM When an application is terminated while particular entities are still in use by the C++ PSM, these entities are not properly cleaned-up and freed. Depending on the context this could cause a crash or hang during shutdown. Solution: The cleanup routines of the affected entities were changed to take into account additional constraints to ensure a proper cleanup under all circumstances. |
dds3023 |
Crash in writer history cache management Under some circumstances the set of samples belonging to an instance in the writer history cache could become corrupted and cause a crash. Solution: The area of code was reviewed and improved to remove this problem. |
dds3062 |
Synchronization error while shutting down spliced A thread tried to access the spliced user-space object after it was destroyed. Solution: Spliced now waits until the thread has finished before destroying the spliced user-space object. |
dds3125/ 9463 |
Idlpp does not handle -I correctly and does not include the directory where the IDL file is located. idlpp does not handle -I correctly when the paths end with a backslash. This typically happens in Visual Studio, because macros always end with a backslash. Also, the directory where the IDL file is located is missing from the include path. Solution: A trailing backslash is now removed from each include path, and the directory where the IDL file is located is now added to the include path. |
dds3142 |
Durability crashed if no configuration was provided Because durability did not load the default values when a configuration (URI) was omitted on the commandline, the configuration was corrupt, which crashed the service. Solution: The service now also loads default settings when a configuration is not available. |
dds3147/dds3189/ 9725 |
DCPS C++ TypeSupportFactory object leakage The C++ TypeSupportFactory leaks in case its associated TypeSupport is registered with a DomainParticipant due to a missing decrease of the reference count of that object in the API itself when the DomainParticipant is deleted. Solution: The reference count of the object is now correctly decreased when a DomainParticipant, that has the type registered, is deleted. |
dds3148/ 9726 |
Maximum UDP payload size could not be configured. DDSI2 used a default maximum payload size for UDP packets, which could cause issues on networks that do not support fragmentation of UDP frames. Solution: A configuration option has been added that allows the maximum payload size to be configured. The DDSI2Service/Unsupported/MaxMessageSize element specifies the maximum size of the UDP payload (default 4096) that will be used by DDSI2. DDSI2 will maintain this limit within the bounds of the DDSI specification. This currently means that even when MaxMessageSize is set below 1192, messages with a payload of up to 1192 may still be observed. |
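As a sketch, the element described above could be set as follows (the 1452-byte value is only an example, chosen to keep datagrams within a typical Ethernet MTU; the service name and surrounding elements are illustrative):

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Unsupported>
      <!-- Maximum UDP payload size in bytes; default 4096 -->
      <MaxMessageSize>1452</MaxMessageSize>
    </Unsupported>
  </DDSI2Service>
</OpenSplice>
```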
dds3154/ 9756 |
Heap memory leakage due to Java based query conditions. The copy routines for a sequence of strings in the DCPS Java API contained a memory leak because a release flag was not set to true. Solution: The release flag is now set to true. |
dds3156/ 9753 |
Reliability problem in networking when running multiple nodes. When running real-time networking with nodes communicating over a reliable channel, it is possible that when a node is shut down, other nodes in the network start to lose communication with each other due to a corruption in the network administration. Solution: The defect in the network administration algorithm is now fixed. |
dds3168/ 9771 |
Destruction of ErrorInfo in C++ could cause crash. Due to a bug in the C++ language binding for the (unsupported) ErrorInfo API a crash could occur. Solution: The bug has been resolved, allowing proper destruction of ErrorInfo objects in the C++ language binding. |
dds3221 |
Using hostname as address of a NetworkPartition fails When a valid or invalid hostname is used as the address of a NetworkPartition of the network service, the error 'ignoring invalid networking address ' is reported and no communication takes place. Solution: The defect was in the hostname resolving algorithm and is now fixed. |
dds3245 |
Durability MMF store leaks unregistrations on injection of persistent data. To make sure that instances are not registered by DataWriters that no longer exist (they cannot exist because the system has been restarted), the durability service injects a self-created unregister message for each unique DataWriter of an instance, which is proper behavior. However, the unregister message needs to be freed after it has been injected, and that part was missing. This caused every unregister message to leak on instance injection. Solution: Unregister messages are now freed after re-publishing them. |
dds3266 |
Crash in signal handler due to uninitialized value. During code analysis, a possible scenario was detected that could result in a crash in the signal handler if its thread received a signal before a value was initialized. Solution: The code has been updated to avoid this possible scenario. |
Report ID. | Description |
---|---|
OSPL-14293 |
IsoCpp2 query parameters setter function fixed. According to the existing documentation, the query parameters setter function should replace the old set of parameters, but it did not. Solution: The function now replaces the old set of parameters with the new set. |
OSPL-14271 |
Calling DDS_end_coherent_changes() without calling DDS_begin_coherent_changes() should NOT lead to an error. Normally a call to DDS_end_coherent_changes() is preceded by a call to DDS_begin_coherent_changes(). However, nothing prevents an application from calling DDS_end_coherent_changes() without having called DDS_begin_coherent_changes(); in that case PRECONDITION_NOT_MET is returned. As a side effect, an error was also written to ospl-error.log. This side effect is undesirable, because it is legitimate to call DDS_end_coherent_changes() without having called DDS_begin_coherent_changes(), and the error suggested that the user did something illegal. Solution: No error is generated any more. Instead, a warning is generated in ospl-info.log to make the user aware that the call to DDS_end_coherent_changes() did not have the desired effect. |
Report ID. | Description |
---|---|
OSPL-14000 |
Internal operation to get the actual domain (shm) memory usage in the C API. The actual amount of shared memory used by a Domain federation is useful to monitor for test purposes as well as for managing a system. The internal operation to get this information has been made public for applications by adding it to the C API. The operation accepts a Domain object, which can be retrieved by the DDS_DomainParticipant_lookup_domain operation, and returns the current amount of used memory in bytes. Note that this is a non-standard, product-specific operation. Solution: Operation signature: DDS_long_long DDS_Domain_get_memory_usage(DDS_Domain domain); |
OSPL-14002 |
Internal operation to get the actual domain (shm) memory usage in the ISOCPP2 API. The actual amount of shared memory used by a Domain federation is useful to monitor for test purposes as well as for managing a system. The internal operation to get this information has been made public for applications by adding it to the ISOCPP2 API's DomainParticipant interface. It returns the current amount of used memory in bytes. Note that this is a non-standard, product-specific operation. Solution: uint64_t DDS::DomainParticipant::get_memory_usage(); |
Description | |
---|---|
OSPL-12968 / 00019801 |
DCPS Python API: out-of-scope entities aren't closed, causing a memory leak. When a Python object holding a DDS entity loses all references, the underlying DDS entity is not deleted, leaking the resource. In a domain where many entities are created dynamically but not closed explicitly with close(), this eventually results in an Out of Resources error. Solution: Python objects are automatically garbage collected when all references to the object are gone. There was code in the DCPS Python API that deletes the underlying DDS entity when this occurs, but it never triggered because there were strong references to all DDSEntity Python objects held in a persistent dictionary in the dds module. To remedy this, this dictionary was changed to store only weak references, so the entity is now deleted when the Python object is garbage collected. There is an important consequence for developers using this API. Before this change, it was possible to create a DDSEntity (typically with a listener), let it go out of scope, rely on the listener to do its work, and keep only a parent entity (like the participant) as the main means to control the lifecycle. This is no longer possible: Python code must keep a reference to a DDSEntity object to keep it active. However, note that a DDSEntity still maintains a strong reference to its parent entity, meaning that once a reader or writer reference is made, one can let go of the participant, publisher, and/or subscriber reference without it being garbage collected. Only when the last reference to the reader/writer is gone is the entire chain of entities deleted. |
Description | |
---|---|
OSPL-10922 |
Implicit IsoCpp2 Participant, Publisher and Subscriber.
A Participant has to be available before, for example, a Topic can be created. For many use cases, this is just a default Participant. The API becomes simpler when a default Participant is assumed if none is provided while creating the Topic. The same applies to Publishers/DataWriters and Subscribers/DataReaders. Solution: A default (singleton) Participant is created implicitly when dds::core::null is used when creating a Topic. An implicit Publisher is created when dds::core::null is provided when constructing a DataWriter (using the Participant of the given Topic). An implicit Subscriber is created when dds::core::null is provided when constructing a DataReader (using the Participant of the given Topic). |
OSPL-10936 |
Simplify creation of transient reliable DataReaders and DataWriters in IsoCpp2.
To create transient reliable DataReaders and DataWriters, you have to create the right QoS settings with the proper policies and a transient reliable Topic. Many use cases need simple transient reliable communication; by simplifying the creation of transient reliable entities, these use cases are simplified as well. Solution: The org::opensplice::topic::qos::TransientReliable() convenience function has been added to IsoCpp2. Topics created with the resulting QoS are transient reliable. DataReaders and DataWriters created using such a Topic are automatically transient reliable as well. |
Report ID. | Description |
---|---|
OSPL-9914 |
Removal of obsolete ISOCPP (version 1) from examples and documentation. ISOCPP version 1 is deprecated and replaced by ISOCPP version 2. Examples and documentation for it were still present in OpenSplice, possibly causing new users to mistakenly start using the wrong ISOCPP version. Solution: The ISOCPP version 1 examples and documentation have been removed, steering new users to ISOCPP version 2. |
OSPL-9763 / 17591 |
IsoCpp2 memory leak when deleting DomainParticipants. A small listener object was not deleted when the related IsoCpp2 DomainParticipant was deleted. Solution: The deletion of the DomainParticipant and the related listener object and thread has been re-factored. |
OSPL-9187 / 16857 OSPL-9221 |
Potential memory leak for unaccessed group coherent subscribers In order to correctly deliver coherent sets to group coherent subscribers with late joining readers, the middleware maintained the transactions that weren't yet accessed. When accessing the subscriber, the transactions were flushed. This was possible because between begin- and end-access no readers can be added to a subscriber. This however introduced a requirement to access the subscriber periodically in order to be able to reclaim the memory. Solution: A restriction has been put on when datareaders can be created for a group-coherent subscriber. A group-coherent subscriber is always created disabled, regardless of the EntityFactory QoS on the domainparticipant. Readers can only be added to the subscriber for as long as it is not enabled. After the subscriber has been enabled by calling enable on it, mutations to the subscriber like adding a reader to it or changing the QoS (even ones that are normally mutable) on the subscriber or its readers are not allowed anymore. Readers can still be removed from the subscriber. If the subscriber is enabled, all its contained readers will be enabled too, regardless of the EntityFactory QoS of the subscriber. This change thus requires the addition of an explicit invocation of enable on the group coherent subscriber. |
Report ID. | Description |
---|---|
OSPL-9654 / 17532 |
Missing function prototypes on C APIs. When using the missing-prototypes compiler warning flag, the generated code for SAC, C99 and FACE will trigger that compile warning. Solution: Add prototypes just before the related generated function definitions. |
Report ID. | Description |
---|---|
OSPL-6636 |
Invalid return code of offered- and requested-deadline-missed get status operations in classic C++ language binding The DataWriter::get_offered_deadline_missed_status and DataReader::get_requested_deadline_missed_status return an error code if the total count is 0. The status output-parameter is still filled correctly. Because there's no last_instance_handle when no deadline is missed yet, the language-binding would incorrectly determine the instance handle is invalid causing the error retcode. Solution: The check was fixed to allow for this special case. |
OSPL-6647 |
Wrong status reset for on_data_available and on_data_on_readers. When an on_data_available event is handled in the DCPS API, it should reset the data_available state before calling the listener. This is done, but on the entity associated with the listener instead of the entity where the event originates. This means, for instance, that when a DomainParticipant receives an on_data_available event, it tries to reset the data_available state of itself instead of the related reader. The Subscriber has the same problem; in the DataReader the entity is the source itself, so it is not a problem there. The same happens when an on_data_on_readers event is received, with the additional problem that it resets the data_available status, while it should reset the on_data_on_readers status of the related subscriber. This problem existed in all DCPS language bindings. Solution: The behaviour of on_data_available and on_data_on_readers has been fixed to reset the status of the entity where the event originated. Also, on_data_on_readers now resets the correct on_data_on_readers status and not the on_data_available status. This was done for all APIs. |
OSPL-7343 |
Topics can be created with invalid names. Topics could be created with invalid characters in their names; however, the subsequent creation of DataWriters or DataReaders would then fail when using such a Topic. A topic name can consist of the following characters: ‘a’-‘z’, ‘A’-‘Z’, ‘0’-‘9’ and ‘_’, but it may not start with a digit. Solution: Topic creation now fails when a name with invalid characters is used. |
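The naming rule can be expressed as a simple regular-expression check; this is an illustrative sketch of the stated rule, not the middleware's actual validation code:

```python
import re

# Letters, digits and underscore, and the first character may not be a
# digit (per the rule stated in the release note above).
_TOPIC_NAME_RE = re.compile(r'[A-Za-z_][A-Za-z0-9_]*')

def is_valid_topic_name(name):
    """Return True if name satisfies the stated topic-name rule."""
    return bool(_TOPIC_NAME_RE.fullmatch(name))
```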
OSPL-7560 / 15776 |
When the TimeBasedFilterQos policy is used and the writer stops
publishing after having published a series of samples, readers do
not receive the latest state that was published after the
minimum_separation delay has passed. The TimeBasedFilterQosPolicy can be used to control the rate at which readers receive samples. In particular, if there is a high frequency writer and receiving applications cannot keep up, then the TimeBasedFilterQosPolicy can be used to reduce the rate at which application readers receive the samples. In the previous implementation readers would not receive the latest state that was published in case the writer stops publishing after the minimum_separation delay has passed. Solution: Readers now receive the latest state that was published after the minimum_separation delay has passed if the instance has changed in the meanwhile and the reader's ReliabilityQosPolicy is set to RELIABLE. |
Report ID. | Description |
---|---|
TSTTOOL-396 |
Query condition support in Python Scripting Engine The Python scripting engine now supports query conditions. |
TSTTOOL-397 |
Status condition support in Python Scripting Engine The Python scripting engine now supports status conditions. |
Report ID. | Description |
---|---|
OSPL-7688 |
Query interface supports non-ISO compliant '!=' operator again. In a previous release the non-ISO compliant operator '!=' was removed from the supported SQL syntax. This was done in an effort to provide one common and standards-compliant SQL parser for all the various services/languages in the product (where previously different components could use different parsers that supported different syntaxes). However, although the operator in question was non-ISO compliant and never advertised in our documentation, many users already used it in their applications. Solution: The '!=' operator has been added to the supported SQL syntax of our SQL parser. |
Report ID. | Description |
---|---|
OSPL-7688 |
Waiting for historical data with condition was not available for
IsoCpp2. The proprietary function wait_for_historical_data_w_condition() was not implemented on the AnyDataReaderDelegate. Solution: The function wait_for_historical_data_w_condition() has been introduced on the AnyDataReaderDelegate. |
OSPL-7872 |
Java5 DDS API lacks support for the conditional wait_for_historical_data operation on DataReader. The Java5 DDS API lacked support for the conditional wait_for_historical_data operation on DataReader. Solution: The proprietary conditional waiting for historical data has been implemented and can be used by casting the DataReader to the OpenSplice-specific org.opensplice.dds.sub.DataReader and invoking the waitForHistoricalData() method with the desired parameters. |
Report ID. | Description |
---|---|
OSPL-7676 / 15818 |
Several issues with the new ISOCPP2 backend for idlpp. The new backend for ISOCPP2 was not working correctly for anonymous and inner types, and generated code had include dependencies on classes that were part of the classic IDL-C++ language mapping. Solution: Anonymous and inner types are now fully supported and all dependencies on the classic C++ language mapping have been removed. |
Report ID. | Description |
---|---|
OSPL-7572 / 15788 |
ISOCPP2 not using C++11 compliant idlpp backend. The C++11 backend used by idlpp was not fully compliant with the new IDL-C++11 mapping as mandated by the OMG. Solution: A new C++11 backend has been provided for the isocpp2/isoc++2 target of idlpp that truly complies with the OMG-mandated IDL to C++11 language mapping. This results in a number of relevant and incompatible changes with respect to the previous C++11 backend:
|
Report ID. | Description |
---|---|
OSPL-6024 / 14394 |
DCPS Java5 API provides incorrect default LatencyBudget QosPolicy and DataWriterQos reliability QosPolicy
The DCPS Java5 API used an infinite LatencyBudget as default instead of a zero duration, and BEST_EFFORT as the default DataWriterQos reliability instead of RELIABLE. Solution: The default LatencyBudget is now zero and the default DataWriterQos reliability is now RELIABLE. |
OSPL-7528 |
WaitSet::wait() and WaitSet::dispatch() behave differently on time-out. WaitSet::dispatch() and WaitSet::wait() differed in how they responded to a timeout: the first would throw a TimeoutError, the second would not. Both should behave identically. Solution: The behavior has been made consistent: WaitSet::wait() now also throws a TimeoutError when the indicated timeout elapses. |
Report ID. | Description |
---|---|
OSPL-6024 / 14394 |
Ignore outstanding loans when deleting entities
A DataReader keeps track of open loans. Open loans are memory given to the user by a take or read action on a DataReader and not (yet) returned with a call to return_loan. A DataReader cannot be deleted while there are open loans; the delete_datareader operation returns RETCODE_PRECONDITION_NOT_MET in that case. There is, however, a need to support deletion of a DataReader while open loans still exist. Solution: To be able to ignore any open loans when deleting a DataReader, a new property "ignoreLoansOnDeletion" has been introduced on the DataReader. The value is interpreted as a boolean (i.e., it must be either 'true' or 'false') and defaults to 'false'. When the property is 'false', the DataReader checks for open loans on deletion and returns PRECONDITION_NOT_MET if any exist. When the property is set to 'true' using the set_property operation, the DataReader ignores its open loans and can be deleted in such a situation. This new property is only available in the classic Java and C++ APIs, both standalone and in CORBA-cohabitation mode. |
OSPL-6342 |
Support for dynamic network partitions for RTNetworking The RTNetworking service needs to support changing network partitions dynamically at run-time. Solution: This feature allows users to amend the configuration of the RTNetworking service at runtime by means of a Topic-API. |
OSPL-7021 |
The behavior of the dispose_all operation differs from the behavior of the dispose
operation for readers on the same federation. The dispose_all operation was performed asynchronously. This could cause readers on the same federation to still read alive samples after the dispose_all operation had been executed. In this respect the dispose_all operation behaved differently from the dispose operation, which has immediate effect for readers on the same federation. Solution: For readers on the same federation the dispose_all operation is now performed synchronously. The state of the readers connected to the same federation is updated during the dispose_all operation. |
OSPL-7027 / 14930 |
100% CPU usage in the networking service when resending locally rejected samples When a local datareader uses resource limits and runs into them, the networking service could go to 100% load while attempting to re-deliver a sample that was rejected because the maximum resource limits had been reached. Solution: The networking service now reschedules the delivery of a sample to a datareader for the next resolution tick instead of attempting the re-delivery in a busy-wait loop until it succeeds. |
Report ID. | Description |
---|---|
OSPL-6034 / 14399 |
Added support for timing out asynchronous calls in OpenSplice RMI Java
OpenSplice RMI waits indefinitely for asynchronous call replies by default. The setTimeout operation on the service proxy does not apply to asynchronous calls, so there was no way to control how long OpenSplice RMI should wait for asynchronous replies. Solution: Henceforth, synchronous and asynchronous calls are waited for during the same default duration of 10 minutes. A new method, defined as "public void handleException(org.opensplice.DDS_RMI.SystemException exp)", has been added to the base asynchronous reply handler to notify the user of timeout expiration (TIMEOUT exception) or of any other error that occurred during asynchronous reply delivery. The user reply handler should override this method to handle these errors. The asynchronous reply timeout can be set on the CRuntime object, on the proxy class, or on the reply handler class using the "setRepliesTimeout" method. The existing setTimeout() operation on the proxy class has been deprecated and replaced by setRepliesTimeout(). A new command line option called "--RMIClientSchedulingModel = |
Report ID. | Description |
---|---|
OSPL-5899-1 |
Listener and waitset trigger order
In releases before V6.5, listeners and waitsets were triggered in random order, but the specification prescribes that listeners are triggered before waitsets. Solution: In V6.5 we have partly corrected the behavior, which is now deterministic in the following order:
Important remark for existing applications: Previously, if a listener was set on a status condition of an entity and the same status condition was attached to a waitset, the listener would be invoked and the waitset would also return the status condition. In V6.5 the waitset will no longer return the status condition: the listener resets the status condition before the waitset is triggered, so the condition no longer evaluates to true when the waitset examines it. Additionally, if an application has not set a listener but has accidentally set the listener mask, the waitset will no longer return the status condition. It appears that only a waitset is used on the status condition, but in reality a default listener is set, which resets the status condition before the waitset is triggered. |
OSPL-5899-2 |
Listener mask behaviour change
According to the DDS specification, setting a listener mask for specific events without implementing a listener for those events (i.e. specifying a 'NULL listener') acts as if a listener is set: the events are consumed and not propagated to listeners of parent entities or to waitsets. Before V6.5, setting a mask had no effect when no listener was implemented, meaning that events were not consumed and parent listeners and waitsets were still triggered. Solution: The issue is fixed, but applications must be aware that if the mask was set by accident, they might now miss events they received before, if listeners are set on parent entities or associated conditions are used in waitsets. |
Report ID. | Description |
---|---|
OSPL-4630/ 12963 |
get_all_data_disposed_topic_status() method is now implemented Solution: The get_all_data_disposed_topic_status() method has been implemented in C++ and Java language bindings. |
Report ID. | Description |
---|---|
OSPL-2733 |
Change in QoS settings of Streams API The QoS settings of the internal DataWriters of the Streams API allowed potentially unlimited memory usage, because the resource limits were not set. Solution: The internal resource-limit QoS settings of a StreamDataWriter have been set to max_samples = 10, to prevent potential unlimited memory usage. |
Report ID. | Description |
---|---|
OSPL-505/OSPL-704 |
OSPL_EXIT_CODEs are not useful in a shell. When using ospl, the two return values OSPL_EXIT_CODE_RECOVERABLE_ERROR (-1) and OSPL_EXIT_CODE_UNRECOVERABLE_ERROR (-2) are indistinguishable in a shell. Solution: The return codes are now positive: +1 and +2. |
OSPL-1078/ 10788 |
Support for parallel demarshalling should be available in Java and C++ Demarshalling data to a language-binding-specific format may take considerable processing, depending on the type of the data. Since demarshalling is done by the application thread performing the read/take, it occurs in a single thread, limiting the throughput that can be achieved. Solution: The C++ and Java datareaders have been extended with support for demarshalling with multiple threads. The number of threads used can be controlled by the new set_property(DDS::Property) operation on C++ and Java datareaders. The property "parallelReadThreadCount" accepts a string representing a positive integer (e.g., "4") as value. If the call is successful, successive read/take operations on that datareader will use the provided number of threads for the demarshalling step. For example: `dr.set_property(new Property("parallelReadThreadCount", "4"));` |
OSPL-1194 |
Record and Replay: Removal of mode attribute The 'mode' attribute of RnR XML storages has been removed because it was not clear how this setting should be used in different circumstances with new and/or existing storages. Solution: The default behaviour is now to always append to a storage, when its resources (i.e. XML files) exist. If the resources do not exist they will be silently created, if possible. The overwrite-behaviour that was achieved by setting the mode to 'w', has been replaced by a separate TRUNCATE command. This command will remove all data from a storage, but it can only be processed if the storage is not open for recording and/or replaying. |
OSPL-1348 |
Record and Replay: Removal of interest Solution: The RnR command API has been modified to allow removal of individual record/replay interest instead of forcibly removing all interest by stopping the scenario responsible for adding it. The API contains separate commands to either add or remove interest to record or replay data belonging to a specific partition/topic. For example, to stop recording data, a REMOVE_RECORD command must be issued, containing interest expressions that match exactly the expressions of the ADD_RECORD command that initiated the recording. Currently all interest will still be forcibly removed when a STOP_SCENARIO command is published but future releases will only allow interest to be removed by publishing remove commands. |
OSPL-1640 |
Default for ospl changed Inexperienced users found ospl hard to use, as the default behavior isn't friendly. Solution: ospl used to perform a stop by default; it now displays the usage. |
OSPL-1790 |
Added missing functions to DataReaderView in the Standalone C++ language binding Applications built on the Corba C++ language binding couldn't be built on the Standalone C++ language binding due to missing functions in the DataReaderView. Solution: The DataReaderView in the Standalone C++ language binding has been extended with the missing functions. |
OSPL-2019 |
Function calls on deleted entities in ccpp and sacpp now correctly return already_deleted Calling functions on deleted entities in ccpp and sacpp returned BAD_PARAMETER instead of ALREADY_DELETED. Solution: Code fixed to match OMG specification. |
Report ID. | Description |
---|---|
OSPL-347/ 8919 |
Changes in Corba Co-habitation Java PSM Solution: In previous releases only a subset of the OpenSplice API was generated using a Corba ORB. Starting from this release, the entire Corba-Java API is generated by the IDL compiler that comes with the ORB. This enables customers to use internal types like ReturnCode_t and InstanceHandle_t in a Corba environment, i.e. Helper and Holder classes are now available for those types as well. Because all classes are now properly implementing Corba interfaces and inheriting from Corba base classes, derived classes are required to provide an implementation for all abstract operations mentioned in the generated interface. In case of Listeners, OpenSplice supplies the ListenerBase class as a convenience. When Listeners extend from this base class they are no longer required to implement these operations themselves. See detailed notes in 6.2 features section above |
OSPL-1160 |
DDS_Service::register_interface C++ operation signature change The DDS_Service::register_interface C++ operation accepted a raw pointer to the service implementation as parameter, whereas its implementation assigned it to a smart pointer. This situation could lead to a double-free problem. Solution: The current register_interface template signature is deprecated and a new template overload has been added that accepts a smart pointer to the service implementation as parameter instead of a raw one. |
Report ID. | Description |
---|---|
9963 / dds3426 dds3478 |
DDSI2 used incorrect encoding for fragment data message headers DDSI2 incorrectly used to generate and interpret fragmented data message headers as if they were slightly extended versions of the non-fragmented data message headers. This caused DDSI2 to be non-compliant with respect to the standard and to fail to interoperate with other vendors' implementations for large samples. Solution: The setting and interpretation has been corrected. This breaks backwards compatibility. For those exceptional cases where backwards compatibility is currently an issue, a setting Unsupported/LegacyFragmentation has been introduced, which may be set to true to continue using and interpreting the old message format. |
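For the exceptional case where the old format is still required, the option can be enabled in the DDSI2 service configuration. A minimal sketch of such a fragment is shown below; the element nesting follows the Unsupported/LegacyFragmentation path mentioned above, while the service name attribute is illustrative.

```xml
<OpenSplice>
  <DDSI2Service name="ddsi2">
    <Unsupported>
      <!-- Re-enable the pre-fix fragmented-data encoding;
           note this breaks interoperability with compliant implementations -->
      <LegacyFragmentation>true</LegacyFragmentation>
    </Unsupported>
  </DDSI2Service>
</OpenSplice>
```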
Report ID. | Description |
---|---|
dds751 |
Change in ReaderLifecycle QoS (invalid sample setting) The QoS setting that determines whether OpenSplice creates invalid samples to communicate state changes, called 'enable_invalid_samples', is now deprecated. In the past, invalid samples could only be enabled or disabled. There is a replacement QoS setting, called 'invalid_sample_visibility', which accepts three values: - MINIMUM_INVALID_SAMPLES: acts like the old enable_invalid_samples = true; an invalid sample is created if there is no regular sample on which the state change can be piggy-backed (this is the default behavior). - NO_INVALID_SAMPLES: acts like the old enable_invalid_samples = false; no invalid samples are created. - ALL_INVALID_SAMPLES: currently not implemented, but in the future will create an invalid sample for every state change, even if a regular sample is available. Using the deprecated QoS setting causes a message to be logged to the info log. This QoS setting will be removed from the product in the future. If applicable, it is recommended to switch application code to the new invalid_sample_visibility setting. Solution: The 'enable_invalid_samples' setting has been deprecated in favor of the new 'invalid_sample_visibility' setting. |
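A hedged sketch of selecting the new setting in the classic Java API follows. The field and enumerator names below are assumptions derived from the policy names in this note; check the generated QoS types of your language binding before use.

```java
// Sketch only: field and enumerator names are assumptions based on the
// policy description above, not a verified API reference.
DDS.DataReaderQosHolder holder = new DDS.DataReaderQosHolder();
subscriber.get_default_datareader_qos(holder);
// Select the default behaviour explicitly instead of the deprecated
// enable_invalid_samples boolean.
holder.value.reader_data_lifecycle.invalid_sample_visibility.kind =
    DDS.InvalidSampleVisibilityQosPolicyKind.MINIMUM_INVALID_SAMPLES;
```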
dds2676 |
Default DataWriter reliability QoS is set to BEST_EFFORT OpenSplice default DataWriter reliability QoS was set to BEST_EFFORT. This was conflicting with the spec which states that default DataWriter reliability QoS should be RELIABLE. Solution: The default DataWriter reliability QoS is now set to RELIABLE in OpenSplice's API. |
dds2806 |
Domain ID change from string to integer The DDS API prescribes a create_participant() method to create a DomainParticipant, the entry point for an application into a specific Domain. To create a DomainParticipant for a specific Domain, the create_participant() call requires a DomainId_t as parameter. In the specification the type of this DomainId_t is 'native', meaning that every vendor is free to choose its own type. In OpenSpliceDDS the type of a DomainId_t was a string, representing either the name of the Domain or a URI pointing to the location of the configuration for the Domain to connect to. This has now become an integer domain ID. With this change OpenSpliceDDS now complies with other vendors on the DomainId_t type, and the DDS interoperability specification (DDS-I) can use it to select the port number for discovery for a specific Domain. Solution: The feature is now implemented and described in the OpenSpliceDDS Reference Manuals. |
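With this change, creating a participant in the classic Java API takes a numeric domain ID rather than a name or URI string. A sketch, where domain ID 0 is illustrative and the domain's actual configuration is assumed to be selected via the environment (e.g. OSPL_URI):

```java
// Sketch only: domain ID 0 is illustrative; the configuration referenced by
// the OSPL_URI environment variable determines the domain's settings.
DDS.DomainParticipantFactory factory = DDS.DomainParticipantFactory.get_instance();
DDS.DomainParticipant participant = factory.create_participant(
    0,                                   // integer DomainId_t (was a string/URI)
    DDS.PARTICIPANT_QOS_DEFAULT.value,   // default participant QoS
    null,                                // no listener
    DDS.STATUS_MASK_NONE.value);         // no listener statuses
```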
dds3061 |
DDSI v1 removed from the codebase The deprecated DDSIv1 code is now officially removed from the source code base and product APIs. DDSI2 should be used. Solution: The feature is now implemented and described in the OpenSpliceDDS Reference Manuals. |