What does software integration mean?

Definition

Software integration is a development process in which separate software systems—applications and components—are connected so they work together to form a new, unified system.

Phases

1.) Requirements assessment and planning

2.) Requirements analysis and specification

3.) Development and implementation

4.) Testing and validation

5.) Maintenance and support


Legacy Systems

Definition

The term legacy system refers to IT systems that use older (possibly obsolete) technologies but are still actively operating and play an essential role in an organisation's everyday operation.

Why use legacy systems?

Why Are They Not Replaced?

Solutions


Overview of Integration Strategies

Point-to-Point Connection

Components connect directly to each other, typically via file transfer or direct database access. Because there is no intermediary layer, communication is fast, and the approach is initially easy to implement.

flowchart LR
    %% Nodes
    R[Radiology]
    EMR[EMR]
    CDB[Central Database]
    PS[Patient Search]
    PDB[Patient DB]
    ER[Emergency Dept.]
    FIN[Billing / Finance]
    PHARM[Pharmacy]

    %% Layout helpers (optional)
    %% Try to mimic the original positions by grouping
    subgraph Left[ ]
        direction TB
        R
        PS
        FIN
    end
    subgraph Middle[ ]
        direction TB
        EMR
        PDB
        PHARM
    end
    subgraph Right[ ]
        direction TB
        CDB
        ER
    end

    %% Connections (based on the diagram)
    R <--> PS
    PS <--> EMR
    PS --> PDB
    PS --> FIN
    EMR --> PDB
    EMR --> FIN
    EMR <--> CDB
    PDB <--> ER
    PDB <--> PHARM
    ER --> PHARM
    PHARM --> ER
    CDB --> ER
    ER --> CDB

Disadvantages – Challenges


Middleware Integration

Components do not connect to each other directly; instead, they communicate through a central intermediary (e.g., API Gateway, Application Server, Enterprise Service Bus – ESB).

The intermediary layer handles different communication protocols.

Disadvantages – Challenges

flowchart TB
    EMR[EMR]
    RAD[Radiology]
    FIN[Billing and Finance]
    ER[Emergency Department]
    MW[Message Oriented Middleware]
    PS[Patient Search]
    PDB[Patient Database]
    CDB[Central Database]
    PHARM[Pharmacy]

    EMR --> MW
    RAD --> MW
    FIN --> MW
    ER --> MW
    MW --> PS
    MW --> PDB
    MW --> CDB
    MW --> PHARM
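
The idea of a central intermediary can be illustrated with a minimal in-process sketch. The `Middleware` class, the adapter functions, and the message fields are hypothetical names chosen for illustration; a real API gateway or ESB would add network transport, security, and error handling. The point shown is only that senders address the intermediary, while per-target adapters translate a common message into whatever format each receiver expects.

```python
from typing import Callable, Dict, List


class Middleware:
    """Hypothetical central intermediary: senders only talk to this object."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], None]] = {}

    def register(self, target: str, handler: Callable[[dict], None]) -> None:
        # Each receiving component registers an adapter under a logical name.
        self._handlers[target] = handler

    def send(self, target: str, message: dict) -> None:
        # Route the message to the adapter registered for the target.
        if target not in self._handlers:
            raise KeyError(f"no component registered as {target!r}")
        self._handlers[target](message)


received: List[str] = []


def pharmacy_adapter(message: dict) -> None:
    # Assumed receiver format: pipe-delimited text.
    received.append(f"{message['patient_id']}|{message['drug']}")


def patient_db_adapter(message: dict) -> None:
    # Assumed receiver format: a prefixed record key.
    received.append(f"db:{message['patient_id']}")


mw = Middleware()
mw.register("pharmacy", pharmacy_adapter)
mw.register("patient_db", patient_db_adapter)

# The sender (e.g., the EMR) never references the receivers directly.
mw.send("pharmacy", {"patient_id": 42, "drug": "aspirin"})
mw.send("patient_db", {"patient_id": 42})
print(received)  # ['42|aspirin', 'db:42']
```

Adding a new receiver in this sketch only requires registering one more adapter; existing senders stay unchanged.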


Message Queue-Based Integration

Components do not connect to each other directly; instead, they communicate via message queues.

Messages are processed asynchronously.

Disadvantages – Challenges

flowchart TB
    %% Top layer systems
    RAD[Radiology]
    EMR[EMR]
    FIN[Billing]
    ER[Emergency Department]

    %% Message Queues
    Q1[[Queue]]
    Q2[[Queue]]
    Q3[[Queue]]

    %% Bottom layer systems
    PS[Patient Search]
    PDB[Patient Database]
    CDB[Central Database]
    PHARM[Pharmacy]

    %% Top -> Queues
    RAD --> Q1
    EMR --> Q1
    FIN --> Q2
    ER --> Q3

    %% Queue interconnection
    Q2 <--> Q3

    %% Queues -> Bottom
    Q1 --> PS
    Q2 --> PDB
    Q3 --> CDB
    Q3 --> PHARM
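
Asynchronous processing through a queue can be sketched with Python's standard `queue` and `threading` modules. This is an in-process stand-in, not a real message broker: the `consumer` function, the `SENTINEL` shutdown marker, and the message texts are assumptions made for illustration. The sender puts messages on the queue and continues immediately, while the consumer processes them independently.

```python
import queue
import threading

SENTINEL = None          # assumed marker that tells the consumer to stop
q = queue.Queue()        # stands in for a message queue
processed = []


def consumer() -> None:
    # Runs in its own thread and handles messages as they arrive.
    while True:
        msg = q.get()
        if msg is SENTINEL:
            break
        processed.append(msg.upper())


worker = threading.Thread(target=consumer)
worker.start()

# The producer does not wait for processing; put() returns right away.
for msg in ["admit patient 1", "order x-ray"]:
    q.put(msg)

q.put(SENTINEL)
worker.join()
print(processed)  # ['ADMIT PATIENT 1', 'ORDER X-RAY']
```

With a single consumer the queue preserves message order; with several consumers, ordering would need to be handled explicitly.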


Data Sharing

A simple approach to integration is data sharing. Data sharing–based integration aims to transfer and share data between systems. This enables individual systems to access and utilize data stored in other systems.

Data sharing can take several forms:

Comparison of Data Sharing Approaches

File-Based Data Sharing

File-based sharing is the most fundamental method of data sharing: one application writes data to a file, and another application reads it from the same file. The data files are stored in a central location, such as a shared folder (e.g., NFS) or an (S)FTP server. The information flow is unidirectional: A → B.

Data Encoding

Most file-based integration approaches use text-based files.

The most common formats are:

Raw text formats may use:

For variable-length records, a delimiter is required to separate data fields. The most widely known method is CSV (Comma-Separated Values).

flowchart LR
    A[System A]
    STORAGE[[Shared Folder\nFTP Server]]
    B[System B]

    A -- writes --> STORAGE
    STORAGE -- reads --> B
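
The unidirectional A → B flow with a CSV file might look like the following sketch. The function names `export_patients` and `import_patients`, the column names, and the temporary directory standing in for the shared folder are all assumptions. Python's `csv` module handles quoting, so a field that itself contains a comma survives the round trip.

```python
import csv
import tempfile
from pathlib import Path


def export_patients(path, rows):
    """System A: write records to the shared CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["patient_id", "name"])  # header row (assumed schema)
        writer.writerows(rows)


def import_patients(path):
    """System B: read the records back as dictionaries keyed by header."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# A temporary directory stands in for the shared folder or FTP server.
shared_dir = Path(tempfile.mkdtemp())
data_file = shared_dir / "patients.csv"

export_patients(data_file, [(1, "Doe, Jane"), (2, "John Smith")])
records = import_patients(data_file)
print(records[0]["name"])  # Doe, Jane
```

Note that all values come back as strings; any type conversion is part of the file format agreement between the two systems.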

File-Based Integration with Lock Mechanism

State Files

State files can be used to track the processing status of data files.

These files may contain the current processing state, such as:

flowchart LR
    A[System A]
    STORAGE[[Shared Folder / FTP Server]]
    LOCK[(data.lock)]
    B[System B]

    A -- "1) create lock" --> LOCK
    A -- "2) write data" --> STORAGE
    B -- "3) detect lock" --> LOCK
    B -- "waits" --> STORAGE
    A -- "4) remove lock" --> LOCK
    B -- "5) read data" --> STORAGE

Lock File Mechanism

1) Lock file creation: System A begins processing a data file and creates a lock file, for example: data.lock.

2) Writing phase: System A creates or writes the data file while data.lock exists. If System B attempts to access the data file, it detects that the lock file exists and waits.

3) Completion: System A finishes processing and removes the data.lock file.

4) Reading phase: System B detects that the data.lock file has been removed and begins its own processing.

Purpose of the Lock Mechanism

Provided that both systems honor the lock and the lock file is created atomically, this method ensures that only one system works with the data file at a time, preventing data conflicts and inconsistencies.

The use of lock files is a simple and effective technique for process synchronization and coordination in file-based integration.
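
The lock file steps described above can be sketched as follows. The helper names (`acquire_lock`, `write_data`, `read_when_unlocked`), the file names, and the polling interval are hypothetical. The sketch assumes a local file system where opening a file with `os.O_CREAT | os.O_EXCL` fails if the file already exists; on some network file systems this guarantee may not hold.

```python
import os
import tempfile
import time
from pathlib import Path


def acquire_lock(lock_path) -> bool:
    """Create the lock file; return False if it already exists."""
    try:
        # O_EXCL makes creation fail when the file is already present.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def write_data(data_path, lock_path, text: str) -> None:
    """System A: create lock, write the data file, remove lock."""
    if not acquire_lock(lock_path):
        raise RuntimeError("lock already held")
    try:
        Path(data_path).write_text(text, encoding="utf-8")
    finally:
        os.remove(lock_path)  # always release, even if writing fails


def read_when_unlocked(data_path, lock_path, timeout=5.0, poll=0.1) -> str:
    """System B: wait until the lock file is gone, then read."""
    deadline = time.monotonic() + timeout
    while os.path.exists(lock_path):
        if time.monotonic() > deadline:
            raise TimeoutError("lock was not released in time")
        time.sleep(poll)
    return Path(data_path).read_text(encoding="utf-8")


work_dir = Path(tempfile.mkdtemp())  # stands in for the shared folder
data_path = work_dir / "data.csv"
lock_path = work_dir / "data.lock"

write_data(data_path, lock_path, "1,Jane\n")
content = read_when_unlocked(data_path, lock_path)
```

This sketch leaves a small window between System B seeing that the lock is gone and actually reading, during which System A could start a new write. A stricter variant would have the reader take the same lock before reading.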

Limitations of File-Based Integration

This method remains widely used today, but it has several significant disadvantages:


Database-Based Data Sharing

Database-based integration is a method that enables data sharing and synchronization between different systems directly through databases.

In this approach, multiple applications and systems use either a single shared database or replicated copies of a database to access and manage data.

flowchart LR
    A[Application A]
    B[Application B]
    DB[(Shared Database)]

    A <--> DB
    B <--> DB


The same system with database replication:

flowchart LR
    A[Application A]
    C[Application C]
    DB1[(Primary Database)]
    B1[Application B]
    B2[Application B]
    DB2[(Replica Database)]

    A <--> DB1
    C <--> DB1
    DB1 -- replication --> DB2
    B1 <--> DB2
    B2 <--> DB2

Example

E-commerce platform and Warehouse Management System: The e-commerce platform can be directly integrated with the warehouse database to provide real-time inventory information.
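
A minimal sketch of this example, using SQLite from the Python standard library as a stand-in for the shared database. The `inventory` table, its columns, the `stock_level` helper, and the SKU value are assumptions for illustration; a production setup would typically use a client-server database. Two separate connections play the roles of the warehouse system and the e-commerce platform.

```python
import os
import sqlite3
import tempfile

# One database file shared by both applications (assumed location).
db_path = os.path.join(tempfile.mkdtemp(), "warehouse.db")

# Warehouse Management System: owns and updates the inventory table.
warehouse = sqlite3.connect(db_path)
warehouse.execute(
    "CREATE TABLE inventory (sku TEXT PRIMARY KEY, quantity INTEGER NOT NULL)"
)
warehouse.execute("INSERT INTO inventory VALUES (?, ?)", ("SKU-1", 10))
warehouse.commit()

# E-commerce platform: a second connection reading the same data.
shop = sqlite3.connect(db_path)


def stock_level(conn, sku):
    """Return the current quantity for a SKU, or 0 if it is unknown."""
    rows = conn.execute(
        "SELECT quantity FROM inventory WHERE sku = ?", (sku,)
    ).fetchall()
    return rows[0][0] if rows else 0


before = stock_level(shop, "SKU-1")

# The warehouse ships three items; the shop sees the change once committed.
warehouse.execute(
    "UPDATE inventory SET quantity = quantity - 3 WHERE sku = ?", ("SKU-1",)
)
warehouse.commit()

after = stock_level(shop, "SKU-1")
print(before, after)  # 10 7

warehouse.close()
shop.close()
```

Both applications here depend on the same table definition, which is why the shared schema acts as the interface between them.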

Similarities to File-Based Integration

Limitations

When to Use Database-Based Integration?

Database-based integration is appropriate in the following scenarios:

However, it may NOT be the best choice:

Integration Strategy Comparison

| Aspect | File-Based Integration | Database-Based Integration | Message Queue-Based Integration |
| --- | --- | --- | --- |
| Coupling | Tight coupling (shared file format) | Tight to medium coupling (shared schema) | Loose coupling |
| Communication Style | Batch, unidirectional | Data-level sharing | Asynchronous message exchange |
| Real-Time Capability | No | Not by default | Yes (naturally asynchronous) |
| Scalability | Limited | Moderate | High |
| Monitoring | Difficult | Database-level monitoring | Built-in queue monitoring (DLQ, metrics) |
| Complexity | Low initial complexity | Medium | High |
| Transaction Support | No native support | Strong ACID support | Depends on message broker |
| Typical Use Case | Periodic data exchange | Shared enterprise systems | Distributed / cloud-native systems |
| Interface Definition | File format agreement | Shared database schema | Message contract / schema definition |
| Cloud-Native Suitability | Low | Medium | High |