In many countries it is mandatory for food labelling to include standardised nutritional information. There are many health apps that allow people to track their intake of key nutritional metrics.
There are also good reasons for foods to be clearly marked with allergens, and with information about whether they are suitable for vegetarians.
There is, however, very little reliable machine-processable data on food. Much crowd-sourced data exists, but it is incomplete and inaccurate. There are also many cases, such as made-to-order Subway sandwiches, where the data needs to be customised, and ideally included on the receipt.
This document proposes a system for encoding the basic nutritional information (not the full ingredients list) in a QR code.
This includes means to identify main allergen groups, and other attributes such as whether the item is suitable for vegetarians.
The proposal allows for extensions, allowing additional values, flags, etc, to be added in a backwards compatible way.
The format is a compact 2D bar code providing key data for health, allergens, and key preferences (e.g. vegetarian). The design takes into account the alphabets used in QR codes and aims for a compact format, while keeping a coding that a human can still read, if necessary, once it has been decoded from the QR code.
The main concept is to allow mobile apps to read data on products, and provide that data to the user.
The QR code standard (ISO/IEC 18004) has a number of alphabets that can be used. The main alphanumeric alphabet uses a set of characters that includes upper case letters A-Z, digits 0-9, and the symbols $, %, *, +, -, ., /, :, and space.
The design of this format makes use of this restricted character set so as to allow more compact encoding. However, readers should accept and parse lower case letters as equivalent to their upper case counterparts.
The final part of the coding allows for a product description string. It is recommended that the same restricted character set be used, but more extensive characters and even UTF-8 characters can be used if desired. To minimise the size of the QR code, the encoding should make use of the most efficient combination of encoding alphabets and switch between them within the code as necessary. This is one reason why the description is the last element in the format.
This is version 1.0. Simply adding new fields does not justify an increase in the major version number. Readers should ignore unknown fields in most cases (except allergens).
The encoding consists of a header and version followed by a number of fields. The fields can be of various types: scale, numeric data fields, information fields, allergens, flags, and the description.
The numeric data fields and information fields can be in any order among themselves, but they must all come before any allergens or flags. This is simply because parsing would not work if an allergen or flag came immediately before a numeric data field or information field: both consist of letters, so the two would run together. The description is always the last field, simply because it allows any characters within it.
The scale field applies to following numeric data fields, and so the placement of numeric data fields relative to the scale field is important.
The format starts with NUQR: followed by the two-letter country code for the data, and digits for the major version, in this case 1 for version one as described here.
The country code is used to indicate the reference definitions for the values, so where there is a harmonised region such as the EU, the country code can be simply EU. This is not the country of origin of the product, it is the country / region for the reference definitions.
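As an illustration of the header rule above, a reader might split off the region code and major version with something like the following Python sketch. The function name `parse_header` and the regular expression are assumptions made for this example and are not part of the specification. In particular, the sketch assumes the region code is exactly two letters and that the version is a run of digits.

```python
import re

# Hypothetical pattern: "NUQR:", two letters for the region, digits for the version.
# IGNORECASE is used because readers should accept lower case letters.
HEADER_RE = re.compile(r"^NUQR:([A-Z]{2})(\d+)", re.IGNORECASE)


def parse_header(code):
    """Return (region, major_version, remaining_text) or raise ValueError."""
    match = HEADER_RE.match(code)
    if match is None:
        raise ValueError("not a NUQR code")
    region = match.group(1).upper()
    major = int(match.group(2))
    # Everything after the header is left for the field parser.
    return region, major, code[match.end():]
```

A reader could then refuse, or warn about, any major version it does not know before going on to parse the fields.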
The default unit for mass is grammes (g) and for energy is kilojoules (kJ). Whilst numeric data fields can include decimal points, it is recommended that a scale is defined instead. This is done by a scale header. The scale header can appear anywhere, and indicates the scale from that point on. The scale applies to numeric data fields, whether mass, energy or some other unit.
The format is * followed by a scale. Each numeric value that follows is divided by this scale. E.g. *10 before mass units means mass is then in 0.1g units, so 123 would mean 12.3g.
The scale applies to all numeric values that follow, until replaced with a new scale header. The new scale header does not compound; it replaces the previous scale header, so *1 returns to single units.
It is recommended that fields not requiring a scale are included first, then the scale, then those that need the scale. E.g. J and W before a scale, and F, C, etc, after the scale, as per the example shown later. Applying a scale before J and W is valid, but is not expected to be needed and may simply cause confusion. It is also recommended that scales 1, 10, 100 are used and not obscure values like 3.
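The scale rule can be sketched in a few lines of Python. This is only an illustration: the helper name `apply_scale` is made up for the example, and the use of `Decimal` (to avoid binary rounding of values such as 12.3) is a design choice of the sketch, not something the format requires. A scale of 1 is assumed until a scale header is seen.

```python
from decimal import Decimal


def apply_scale(raw_digits, scale):
    """Divide a raw field value by the scale currently in force."""
    return Decimal(raw_digits) / Decimal(scale)


scale = 1                            # assumed default before any scale header
energy = apply_scale("1100", scale)  # J1100 is read as 1100 kJ
scale = 10                           # *10 seen: replaces the previous scale
mass = apply_scale("123", scale)     # 123 is read as 12.3 g
scale = 1                            # *1 replaces *10, it does not compound
```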
The numeric data fields consist of one (or more) letters followed by one (or more) digits. Only single letters are defined so far, but two (or more) letter fields could be added in the future.
The letter(s) represent the field type. The digits represent the value of the field. The digits can include decimal points, but it is recommended that a scale is defined instead.
In addition, the digits can, in some cases, be followed by / and a secondary value.
Apart from W (packet/portion size), all mass values are g per 100g (i.e. a percentage) but are subject to the scale setting. All energy values are in kJ per 100g, but are subject to the scale setting. Applications that use this data can present it per pack and per portion as necessary. If an application wishes to show data in other units, such as Calories or oz, that is a conversion to be done in the application. There is no variant format for different base units in the QR code format.
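The conversions left to the application are simple arithmetic. The following Python sketch shows one way an app might derive per-portion figures and kilocalories from the per-100g values. The function names are hypothetical, and the factor of 4.184 kJ per kcal is the conventional thermochemical value, assumed here rather than taken from this specification.

```python
KJ_PER_KCAL = 4.184  # assumed conversion factor (thermochemical calorie)


def per_portion(value_per_100g, portion_g):
    """Scale a per-100g value to a portion of the given mass in grammes."""
    return value_per_100g * portion_g / 100


def kj_to_kcal(kj):
    """Convert kilojoules to kilocalories."""
    return kj / KJ_PER_KCAL


# Example: 2067 kJ per 100g with a 25g portion.
portion_kj = per_portion(2067, 25)
portion_kcal = kj_to_kcal(portion_kj)
```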
Code | Primary value | Secondary value |
---|---|---|
W | Mass of entire pack in grammes. | Optional portion size in grammes. |
J | Energy in kJ per 100g | |
F | Fat | Saturated |
C | Carbohydrate | Sugars |
B | Fibre | |
P | Protein | |
S | Salt | |
As a special case, a value that is just a . is valid and indicates a trace amount. Where there is zero present the field may simply be omitted.
Unknown fields should be ignored.
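To make the value rules concrete, here is a small Python sketch that decodes the digits part of a numeric data field, including the optional secondary value and the trace marker. The names `TRACE`, `decode_value` and `decode_pair` are invented for this example, and returning a string for a trace amount is just one possible representation.

```python
from decimal import Decimal

TRACE = "trace"  # hypothetical marker for a value given as just "."


def decode_value(raw, scale=1):
    """Decode one value: '.' means a trace amount, otherwise divide by the scale."""
    if raw == ".":
        return TRACE
    return Decimal(raw) / Decimal(scale)


def decode_pair(digits, scale=1):
    """Decode 'primary' or 'primary/secondary' into a (primary, secondary) tuple."""
    primary, _, secondary = digits.partition("/")
    secondary_value = decode_value(secondary, scale) if secondary else None
    return decode_value(primary, scale), secondary_value
```

For instance, with a scale of 10 in force, `decode_pair("300/12", 10)` gives 30 and 1.2, matching a field C300/12 read as 30.0g carbohydrate of which 1.2g sugars.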
In addition to the numeric fields as above, information fields in the same format are defined.
Code | Value |
---|---|
I | Identity (UPC/EAN bar code number) |
Unknown fields should be ignored.
Allergens should be included, if necessary. These can be in any order. Each allergen is given an alphabetic code of one or more characters. Each is included prefixed with $.
Unlike data fields, an unknown allergen code should not be ignored: it should be flagged to the user as an unknown allergen code, with advice to read the label.
Code | Meaning |
---|---|
G | Cereals containing gluten, namely: wheat (such as spelt and Khorasan wheat), rye, barley, oats |
C | Crustaceans for example prawns, crabs, lobster, crayfish |
E | Eggs |
F | Fish |
P | Peanuts |
S | Soybeans |
M | Milk |
N | Nuts; namely almonds, hazelnuts, walnuts, cashews, pecan nuts, Brazil nuts, pistachio nuts, macadamia (or Queensland) nuts |
CE | Celery (including celeriac) |
MS | Mustard |
SM | Sesame |
SD | Sulphur dioxide/sulphites, where added and at a level above 10mg/kg in the finished product. This can be used as a preservative in dried fruit |
LU | Lupin which includes lupin seeds and flour and can be found in types of bread, pastries and pasta |
MO | Molluscs like clams, mussels, whelks, oysters, snails and squid |
X | Generic unspecified allergen - READ THE LABEL! This is used where a code has yet to be allocated for some new allergen. |
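The difference in how unknown allergen codes are treated can be sketched as a lookup with a loud fallback. The dictionary below abbreviates the meanings from the table above. The function name `describe_allergen` and the wording of the warning are assumptions of this example.

```python
# Codes from the allergen table, with shortened descriptions.
ALLERGENS = {
    "G": "Cereals containing gluten",
    "C": "Crustaceans",
    "E": "Eggs",
    "F": "Fish",
    "P": "Peanuts",
    "S": "Soybeans",
    "M": "Milk",
    "N": "Nuts",
    "CE": "Celery",
    "MS": "Mustard",
    "SM": "Sesame",
    "SD": "Sulphur dioxide/sulphites",
    "LU": "Lupin",
    "MO": "Molluscs",
    "X": "Unspecified allergen - read the label",
}


def describe_allergen(code):
    """Return a description, warning the user if the code is not recognised."""
    code = code.upper()  # lower case is accepted as equivalent
    if code in ALLERGENS:
        return ALLERGENS[code]
    # Unknown allergens must never be silently dropped.
    return "Unknown allergen code %s - read the label" % code
```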
Additional flags can be added, in any order. These indicate some attribute of the product. They consist of either + or - followed by letters. The use of + asserts the flag is true, and - asserts it is not true. Absence of a flag makes no assertion.
For example +V means suitable for vegetarians but -V means not suitable for vegetarians. If there is neither, then the QR code does not stipulate whether the product is suitable for vegetarians or not.
Code | Meaning |
---|---|
V | Suitable for vegetarians |
VG | Suitable for vegans |
K | Kosher |
KP | Kosher for Passover |
H | Halāl |
Unknown fields may be ignored.
The tables of allergens and flags do not overlap, so the allergen codes can also be used as flags: with + to indicate "may contain", i.e. where the product is processed in the same factory, or with - to assert that the allergen is definitely not present, e.g. -G to assert definitely gluten free.
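Because a flag can be asserted true, asserted false, or simply absent, an application needs a three-state result. A minimal Python sketch, assuming the flags have already been split out as strings such as "+V" or "-G" (the function name `flag_state` is hypothetical, and letting the last occurrence win when a flag is repeated is an assumption of the sketch):

```python
def flag_state(flags, code):
    """Return True for +CODE, False for -CODE, None if no assertion is made."""
    code = code.upper()
    state = None
    for flag in flags:
        sign, name = flag[0], flag[1:].upper()
        if name == code:
            # "+" asserts the flag, "-" asserts it is not true.
            state = (sign == "+")
    return state


flags = ["+V", "-G", "+N"]
vegetarian = flag_state(flags, "V")    # asserted suitable for vegetarians
gluten = flag_state(flags, "G")        # asserted gluten free
kosher = flag_state(flags, "K")        # no assertion either way
```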
The final field is the description. The format is % followed by manufacturer, then / followed by description.
It is recommended that these fields use the same restricted alphanumeric upper case character set, but mixed case and UTF-8 can be used if desired. The QR code should be efficiently encoded, with changes of alphabet as necessary.
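Splitting the description field is straightforward. The sketch below assumes the first / separates manufacturer from description and that any later / belongs to the description. That choice, the function name `parse_description`, and the handling of a missing / (an empty description) are assumptions of the example.

```python
def parse_description(field):
    """Split '%MANUFACTURER/DESCRIPTION' into (manufacturer, description)."""
    if not field.startswith("%"):
        raise ValueError("description field must start with %")
    # partition splits on the first "/" only.
    manufacturer, _, description = field[1:].partition("/")
    return manufacturer, description
```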
The header must be present, and this is followed by digits for the version. After this the fields are parsed in different ways depending on the first character.
The content is broken into fields using the following rules, which allow for expansion of the specification.
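One way to break the content after the header into fields, dispatching on the first character, is a single regular expression. The Python sketch below is an interpretation based on the field descriptions in this document (* for scale, $ for allergens, + or - for flags, % for the description, letters followed by a value for data and information fields). The pattern, the token tuples and the name `tokenize` are all assumptions of this example, not a normative grammar.

```python
import re

# Hypothetical grammar; each alternative is selected by its first character.
TOKEN_RE = re.compile(
    r"\*(?P<scale>\d+(?:\.\d+)?)"        # scale header, e.g. *10
    r"|\$(?P<allergen>[A-Z]+)"           # allergen, e.g. $CE
    r"|(?P<flag>[+-][A-Z]+)"             # flag, e.g. +V or -G
    r"|%(?P<description>.*)"             # description: the rest of the code
    r"|(?P<name>[A-Z]+)(?P<value>\d*\.?\d*)(?:/(?P<secondary>\d*\.?\d*))?",
    re.IGNORECASE | re.DOTALL,
)


def tokenize(body):
    """Split the text after the header into a list of token tuples."""
    tokens = []
    pos = 0
    while pos < len(body):
        m = TOKEN_RE.match(body, pos)
        if m is None:
            raise ValueError("cannot parse at offset %d" % pos)
        if m.group("scale") is not None:
            tokens.append(("scale", m.group("scale")))
        elif m.group("allergen") is not None:
            tokens.append(("allergen", m.group("allergen").upper()))
        elif m.group("flag") is not None:
            tokens.append(("flag", m.group("flag").upper()))
        elif m.group("description") is not None:
            tokens.append(("description", m.group("description")))
        else:
            # Data or information field: values stay as text so that
            # identifiers such as bar code numbers are not scaled.
            tokens.append(("field", m.group("name").upper(),
                           m.group("value"), m.group("secondary")))
        pos = m.end()
    return tokens
```

Keeping the values as strings at this stage means the scale can be applied afterwards to numeric data fields only, and unknown field names can be skipped without affecting the rest of the parse.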
This is an example with no fibre, and trace fat content.
NUQR:EU1 | Header, version 1, EU reference |
I8722700166221 | EAN barcode 8722700166221 |
J1100 | 1100kJ per 100g |
W400/8 | 400g jar, 8g portions |
*10 | Units are 0.1g |
F. | Fat - trace amount |
C300/12 | Carbohydrates 30.0g, of which sugars 1.2g, per 100g |
P340 | Protein 34.0g, per 100g |
S108 | Salt 10.8g, per 100g |
+V | Suitable for vegetarians |
$G | Contains gluten |
$CE | Contains celery |
%UNILEVER/MARMITE | Manufacturer UNILEVER, Product MARMITE |
The entire code is therefore:
NUQR:EU1I8722700166221J1100W400/8*10F.C300/12P340S108+V$G$CE%UNILEVER/MARMITE
This encodes to a QR code for example:
NUQR:EU1 | Header, version 1 |
I5000436850816 | EAN barcode 5000436850816 |
J2067 | 2067kJ per 100g |
W200/25 | 200g pack, 25g portions |
*10 | Units are 0.1g |
F289/58 | Fat 28.9g, of which saturates 5.8g, per 100g |
C415/95 | Carbohydrates 41.5g, of which sugars 9.5g, per 100g |
B89 | Fibre 8.9g, per 100g |
P130 | Protein 13.0g, per 100g |
S12 | Salt 1.2g, per 100g |
$N | Contains nuts |
$G | Contains gluten |
$SD | Contains sulphur dioxide |
+V | Suitable for vegetarians |
%TESCO/BOMBAY MIX | Manufacturer TESCO, product BOMBAY MIX |
The entire code is therefore:
NUQR:EU1I5000436850816J2067W200/25*10F289/58C415/95B89P130S12$N$G$SD+V%TESCO/BOMBAY MIX
This encodes to a QR code for example:
Idea proposed by James Kennard, specification version 1.0 by Adrian Kennard, Dec 2017.