Understanding EDI Delimiters: Segments, Elements, and Separators
Every EDI file is just text, but it is text organized by special characters called delimiters. Understanding them is the single most useful thing you can learn for reading EDI, because once you know how a file is divided, the rest is just looking up what each piece means.
The three levels of structure
EDI data is structured at three levels, each marked by its own delimiter:
- Segment terminator - marks the end of a segment (one logical line). After it, a new segment begins.
- Element separator - divides the values (elements) within a segment.
- Component separator - divides sub-values (components) within a single element, for composite data.
Think of it like an address. The segment is the whole address line, the elements are street, city, and postal code, and components are the parts of a single field that has internal structure.
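The three levels can be sketched as nested splits. This is a minimal illustration in Python, not a full parser: the helper name `split_levels` is hypothetical, the delimiters are passed in rather than detected, and escaping is ignored.

```python
def split_levels(text, seg_term="~", elem_sep="*", comp_sep=":"):
    """Split raw EDI text into segments -> elements -> components.

    Assumes the delimiters are already known and that no delimiter
    character appears inside the data itself.
    """
    segments = []
    for raw in text.split(seg_term):
        raw = raw.strip("\r\n")  # tolerate line breaks after the terminator
        if not raw:
            continue  # skip the empty piece after the final terminator
        # Every element becomes a list of components (usually just one).
        segments.append([elem.split(comp_sep) for elem in raw.split(elem_sep)])
    return segments


x12 = split_levels("PO1*1*100*EA*12.50~")
edifact = split_levels("QTY+21:100'", seg_term="'", elem_sep="+", comp_sep=":")
print(x12)      # [[['PO1'], ['1'], ['100'], ['EA'], ['12.50']]]
print(edifact)  # [[['QTY'], ['21', '100']]]
```

The nesting mirrors the structure directly: the outer list holds segments, each segment holds elements, and each element holds one or more components.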
X12 delimiters
In ANSI X12, the common defaults are:
- Segment terminator: tilde ~
- Element separator: asterisk *
- Component separator: colon : (or sometimes >)
So in PO1*1*100*EA*12.50~, the asterisks separate elements and the tilde ends the segment. Critically, X12 does not fix these characters by rule; it declares them by position in the opening ISA segment. The ISA is fixed-length: 106 characters, counting the segment terminator. The parser reads the element separator from the 4th character (right after "ISA"), the component separator from the 105th character (the ISA16 element), and the segment terminator from the 106th and last character. This is why a correct ISA is essential: if a field is padded to the wrong width, every later position shifts and the whole file mis-parses.
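Positional detection can be sketched in a few lines of Python. The function name `detect_x12_delimiters` and the sample ISA values (sender, receiver, dates, control number) are made up for illustration; the sample is built from fixed-width fields so that it comes out at exactly 106 characters.

```python
def detect_x12_delimiters(data):
    """Read the three X12 delimiters from fixed positions in the ISA."""
    if len(data) < 106 or not data.startswith("ISA"):
        raise ValueError("expected a 106-character ISA segment at the start")
    return {
        "element": data[3],      # 4th character, right after "ISA"
        "component": data[104],  # 105th character (ISA16)
        "segment": data[105],    # 106th and last character of the ISA
    }


# Illustrative ISA with made-up values; each field is padded to its fixed width.
fields = [
    "00", " " * 10, "00", " " * 10,
    "ZZ", "SENDER".ljust(15), "ZZ", "RECEIVER".ljust(15),
    "240115", "1200", "U", "00401", "000000001", "0", "P", ":",
]
sample_isa = "ISA*" + "*".join(fields) + "~"

print(len(sample_isa))                    # 106
print(detect_x12_delimiters(sample_isa))  # {'element': '*', 'component': ':', 'segment': '~'}
```

Because the lookup is purely positional, the sketch never searches for a tilde or an asterisk. It works the same way for a file that uses, say, a pipe as its element separator.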
EDIFACT delimiters
In UN/EDIFACT, the defaults are different:
- Segment terminator: apostrophe '
- Element separator: plus +
- Component separator: colon :
- Release (escape) character: question mark ?
- Decimal mark: period .
So QTY+21:100' uses a plus to separate elements, a colon to separate the qualifier from the value within the composite, and an apostrophe to end the segment. EDIFACT can declare these explicitly using an optional UNA service string at the very start of the interchange. If present, UNA is followed by six characters in a fixed order: component separator, element separator, decimal mark, release character, a reserved character (normally a space), and segment terminator. With the defaults it reads UNA:+.? ' and a parser can use it to adapt to non-default delimiters.
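Reading the UNA string might look like the following Python sketch. The function name `detect_edifact_delimiters` and the sample inputs are hypothetical. The sketch assumes the six UNA characters arrive in the order component separator, element separator, decimal mark, release character, reserved, segment terminator, and it falls back to the defaults when no UNA is present.

```python
UNA_ORDER = ("component", "element", "decimal", "release", "reserved", "segment")
EDIFACT_DEFAULTS = dict(zip(UNA_ORDER, ":+.? '"))


def detect_edifact_delimiters(data):
    """Return the delimiters declared by UNA, or the defaults without one."""
    if data.startswith("UNA") and len(data) >= 9:
        # The six characters after "UNA" map onto the fixed order above.
        return dict(zip(UNA_ORDER, data[3:9]))
    return dict(EDIFACT_DEFAULTS)


print(detect_edifact_delimiters("UNA:+.? 'UNB+UNOA:1")["segment"])  # '
print(detect_edifact_delimiters("UNB+UNOA:1")["element"])           # +
print(detect_edifact_delimiters("UNA|^,! ~UNB^UNOA|1")["decimal"])  # ,
```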
The release character: EDIFACT's escape hatch
What happens if your data legitimately contains a plus sign or a colon? EDIFACT solves this with the release character, by default ?. Placing it before a delimiter tells the parser to treat that delimiter as literal data. So 10?+2 is the text "10+2", not two elements. X12 has no such mechanism, which is one reason X12 data tends to avoid delimiter characters entirely and why X12 implementations choose delimiters unlikely to appear in real data.
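A release-aware split cannot use a plain `str.split`, because it has to look at the character before each delimiter. The Python sketch below shows one way to do it; `split_escaped` and `unescape` are hypothetical helper names, and the segment tag XYZ is made up. Splitting keeps the escape pairs intact so that a later component-level split still sees them; unescaping is the last step.

```python
def split_escaped(text, delimiter, release="?"):
    """Split on delimiter, skipping any character preceded by the release char."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == release and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape pair for later levels
            i += 2
        elif ch == delimiter:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def unescape(value, release="?"):
    """Drop each release character and keep the character it protects."""
    out, i = [], 0
    while i < len(value):
        if value[i] == release and i + 1 < len(value):
            out.append(value[i + 1])
            i += 2
        else:
            out.append(value[i])
            i += 1
    return "".join(out)


elements = split_escaped("XYZ+10?+2+ok", "+")
print(elements)                         # ['XYZ', '10?+2', 'ok']
print([unescape(e) for e in elements])  # ['XYZ', '10+2', 'ok']
```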
Empty elements and why double delimiters appear
You will often see consecutive delimiters, like ** in X12 or ++ in EDIFACT. These are not errors. They mean an element was intentionally left empty while preserving the position of later elements. Because meaning is positional, a parser must keep counting positions even when some are blank. Reading BEG*00*SA*PO-99831**20240115, the empty value between the PO number and the date is a skipped optional element, not missing data.
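The effect of positional meaning is easy to see in Python. The helper `get_element` below is a hypothetical name; it treats the segment ID as position 0, so element N sits at index N, and it assumes the segment terminator has already been stripped.

```python
def get_element(segment, position, elem_sep="*"):
    """Return element N of a segment (1-based), or None if empty or absent."""
    parts = segment.split(elem_sep)  # parts[0] is the segment ID
    if position < len(parts) and parts[position] != "":
        return parts[position]
    return None


segment = "BEG*00*SA*PO-99831**20240115"
print(get_element(segment, 3))  # PO-99831
print(get_element(segment, 4))  # None
print(get_element(segment, 5))  # 20240115

# Dropping empty strings looks tidy but shifts every later position:
filtered = [e for e in segment.split("*") if e]
print(filtered[4])              # 20240115, now wrongly sitting in position 4
```

Python's `str.split` with an explicit separator keeps empty strings, which is exactly the behavior a positional format needs.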
Why delimiter knowledge prevents bugs
- If you assume the wrong segment terminator, the entire file collapses into one giant segment.
- If you ignore the component separator, you will read one composite element as if it were several elements, shifting every position after it.
- If you ignore the release character in EDIFACT, escaped delimiters split data incorrectly.
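The first two failure modes can be reproduced in a few lines of Python. The interchange fragment below is made up for illustration (the ST and SE values are placeholders), and `split_segments` is a hypothetical helper.

```python
import re

data = "ST*850*0001~BEG*00*SA*PO-99831**20240115~SE*3*0001~"


def split_segments(text, seg_term):
    """Split on the segment terminator and drop the trailing empty piece."""
    return [s for s in text.split(seg_term) if s]


wrong = split_segments(data, "'")  # EDIFACT terminator applied to X12 data
right = split_segments(data, "~")
print(len(wrong), len(right))      # 1 3

composite = "QTY+21:100"
naive = re.split(r"[+:]", composite)                     # ":" treated as an element separator
correct = [e.split(":") for e in composite.split("+")]  # ":" kept inside the composite
print(naive)    # ['QTY', '21', '100']
print(correct)  # [['QTY'], ['21', '100']]
```

In the naive result the value 100 appears to be a separate second element rather than the second component of the first element, which is the position shift described above.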
A good viewer detects the delimiters automatically - from the ISA in X12 or the UNA in EDIFACT - so you do not have to guess. That detection is the first thing EdiPeek does when you paste a file.
Paste any X12 or EDIFACT file and EdiPeek auto-detects the delimiters and splits every segment and element correctly, in your browser, with nothing uploaded.