Understanding the JPEG file header format


2023-10-17 04:36 | Source: collected from the web


1. Why JPEG?






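Open the file in binary mode. If its first two bytes are FF D8 and its last two bytes are FF D9, the file can tentatively be identified as a JPEG.

The JPEG SOI (start of image) marker is FF D8, and the EOI (end of image) marker is FF D9.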
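The check described above can be sketched in a few lines of Python. The function name `looks_like_jpeg` is made up for this illustration, and the result is only a first guess, since the check does not look at anything between the two markers.

```python
SOI = b"\xff\xd8"  # start of image
EOI = b"\xff\xd9"  # end of image


def looks_like_jpeg(data: bytes) -> bool:
    """Tentative test: the data starts with SOI and ends with EOI."""
    return len(data) >= 4 and data[:2] == SOI and data[-2:] == EOI
```

For a file on disk, the bytes would be read in binary mode first, for example with `open(path, "rb").read()`.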



The markers 0xFFE0~0xFFEF are named "Application Markers". They are not necessary for decoding the JPEG image; they are used by user applications. For example, older Olympus/Canon/Casio/Agfa digicams use JFIF (JPEG File Interchange Format) for storing images. JFIF uses the APP0 (0xFFE0) marker for inserting digicam configuration data and a thumbnail image. Exif also uses an Application Marker for inserting data, but it uses the APP1 (0xFFE1) marker to avoid a conflict with the JFIF format. Every Exif file starts with this layout: the SOI marker, followed by the APP1 marker segment.

In short: older cameras use the JFIF format, in which the segment after SOI starts with FF E0 and its header carries the string "JFIF". Exif is now more common, and its segment starts with FF E1.



0xFF + Marker Number(1 byte) + Data size(2 bytes) + Data(n bytes)
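As a rough illustration of this layout, the sketch below walks the marker segments at the front of a JPEG. The name `iter_segments` is hypothetical, and the code assumes a simple file: no 0xFF fill bytes between segments, and every marker before SOS carries a size field. The size is big-endian and counts its own two bytes.

```python
import struct


def iter_segments(data: bytes):
    """Yield (marker number, payload) for each segment after SOI.

    Stops at SOS (0xDA) or EOI (0xD9); a sketch, not a full JPEG parser.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected 0xFF at offset {pos}")
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):
            break
        # Data size: 2 bytes, big-endian, includes the size field itself
        (size,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + size]
        pos += 2 + size
```

Collecting the result with `list(iter_segments(data))` gives one `(marker, payload)` pair per segment, so an APP0 segment shows up with marker `0xE0` and an APP1 segment with `0xE1`.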




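The blue box marks the JFIF segment. Its length bytes are 00 10 = 16; the length counts its own two bytes, so 14 bytes of content follow. 4A 46 49 46 is the ASCII code for "JFIF". The remaining 10 bytes are not examined here (per the JFIF specification they hold a terminating NUL, the version, the density units, the X and Y density, and the thumbnail dimensions); the segment can simply be skipped by using its length.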



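The segment starts with FF E1. The length bytes 16 6F are big-endian, so the length is 0x166F = 5743, roughly 5.6 KB of data, which is quite a lot of information. 45 78 69 66 is the ASCII code for "Exif", and the following 00 00 is unused.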
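A minimal sketch of reading this part of the segment follows. `exif_tiff_payload` is a hypothetical helper; it assumes its argument starts exactly at the FF E1 marker and returns the TIFF data that follows the "Exif" identifier.

```python
def exif_tiff_payload(app1_segment: bytes) -> bytes:
    """Return the TIFF data that follows the Exif identifier in an APP1 segment."""
    if app1_segment[:2] != b"\xff\xe1":
        raise ValueError("not an APP1 segment")
    # The size is big-endian and includes its own two bytes
    size = int.from_bytes(app1_segment[2:4], "big")
    if app1_segment[4:10] != b"Exif\x00\x00":
        raise ValueError("APP1 segment does not carry Exif data")
    return app1_segment[10:2 + size]


# The length bytes 16 6F decode as a big-endian integer
segment_length = int.from_bytes(b"\x16\x6f", "big")
```

Here `segment_length` evaluates to 5743.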



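The data part uses the TIFF format (see reference [2]), which is organized into IFDs (Image File Directories).

The TIFF data consists mainly of IFD0 and IFD1. IFD0 has two parts: the directory, and the link to IFD1 (a 32-bit offset). Each record in the directory is an IFD entry, and an IFD may itself contain a series of nested IFDs. IFD1 has a similar structure.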


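Bytes                      Contents
FFE1                       APP1 Marker
SSSS                       APP1 Data: APP1 Data Size
45786966 0000              APP1 Data: Exif Header
49492A00 08000000          APP1 Data: TIFF Header
XXXX. . . .                APP1 Data: IFD0 (main image): Directory
LLLLLLLL                   APP1 Data: IFD0 (main image): Link to IFD1
XXXX. . . .                APP1 Data: IFD0 (main image): Data area of IFD0
XXXX. . . .                APP1 Data: Exif SubIFD: Directory
00000000                   APP1 Data: Exif SubIFD: End of Link
XXXX. . . .                APP1 Data: Exif SubIFD: Data area of Exif SubIFD
XXXX. . . .                APP1 Data: IFD1 (thumbnail image): Directory
00000000                   APP1 Data: IFD1 (thumbnail image): End of Link
XXXX. . . .                APP1 Data: IFD1 (thumbnail image): Data area of IFD1
FFD8XXXX. . . XXXXFFD9     APP1 Data: Thumbnail image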



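Figure 2 shows the layout of the TIFF format. In Figure 1, the bytes 49 49 2A 00 08 00 00 00 are the TIFF header. 49 49 is "II", meaning little-endian byte order. 2A 00 is a fixed constant (42). 08 00 00 00 is the offset of the first IFD relative to the start of the TIFF header. The header is 8 bytes long, so this value is usually 8, but because the offset is a 4-byte field the IFD could in theory be anywhere in the file.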
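The 8-byte header can be decoded with the standard `struct` module, as in the sketch below. `parse_tiff_header` is an illustrative name; it returns the byte order as a `struct` prefix (`"<"` for II, `">"` for MM) together with the offset of the first IFD.

```python
import struct


def parse_tiff_header(tiff: bytes):
    """Return (byte_order, ifd0_offset) from an 8-byte TIFF header."""
    order = {b"II": "<", b"MM": ">"}.get(tiff[:2])
    if order is None:
        raise ValueError("unknown byte-order mark")
    # 2-byte constant 42, then the 4-byte offset of the 0th IFD
    magic, ifd0_offset = struct.unpack(order + "HI", tiff[2:8])
    if magic != 42:
        raise ValueError("not a TIFF header")
    return order, ifd0_offset
```

Both byte orders seen in this article decode to an offset of 8: `49 49 2A 00 08 00 00 00` in little-endian order and `4D 4D 00 2A 00 00 00 08` in big-endian order.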

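Next, 0A 00 is where the IFD data starts (this position can also be computed as the start of the TIFF header plus the offset of the 0th IFD). The two bytes 0A 00 read as 10 in little-endian order, so Number of Directory Entries = 10.

Each IFD entry is 12 bytes long, as shown in the right-hand panel of Figure 2: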


tag[0:2]: one of a set of predefined tags
type[2:4]: the data type; 1 is BYTE, 2 is ASCII, 3 is SHORT, and so on
count[4:8]: the number of values
value[8:12]: when the data fits in 4 bytes, the value itself is stored here; otherwise this field holds the offset of the actual data

==> OFFSET 8 - 1232
tag=271,type=2,count=6,value=134  actual_data = Canon
tag=272,type=2,count=16,value=140  actual_data = Canon EOS 1100D
tag=274,type=3,count=1,value=1
tag=282,type=5,count=1,value=156
tag=283,type=5,count=1,value=164
tag=296,type=3,count=1,value=2
tag=305,type=2,count=27,value=172  actual_data = Adobe Photoshop CS Windows
tag=306,type=2,count=20,value=199  actual_data = 2013:06:05 17:25:16
tag=531,type=3,count=1,value=2
tag=34665,type=4,count=1,value=220
number of directory entries = 6 ==> OFFSET 1232 - 0
tag=259,type=3,count=1,value=6
tag=282,type=5,count=1,value=1310
tag=283,type=5,count=1,value=1318
tag=296,type=3,count=1,value=2
tag=513,type=4,count=1,value=1326
tag=514,type=4,count=1,value=4409
-------------------------------------
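To make the 12-byte entry layout concrete, here is a small sketch that unpacks one entry. `unpack_entry` and `TYPE_SIZES` are names invented for this example. The sizes for types 4 (LONG, 4 bytes) and 5 (RATIONAL, 8 bytes) come from the TIFF 6.0 specification rather than from the text above. The entry bytes are re-encoded by hand from the listed values tag=271, type=2, count=6, value=134, assuming little-endian order; they are not copied from the file.

```python
import struct

# Bytes per value for each type: BYTE, ASCII, SHORT, LONG, RATIONAL
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8}


def unpack_entry(entry: bytes, order: str = "<"):
    """Split a 12-byte IFD entry into (tag, type, count, value field, inline)."""
    tag, type_id, count = struct.unpack(order + "HHI", entry[:8])
    # The value field holds the data itself only if it fits in 4 bytes
    inline = count * TYPE_SIZES[type_id] <= 4
    return tag, type_id, count, entry[8:12], inline


# Hand-encoded entry for tag=271, type=2, count=6, value=134 (an assumption)
make_entry = bytes.fromhex("0f01 0200 06000000 86000000")
tag, type_id, count, value_field, inline = unpack_entry(make_entry)
offset = int.from_bytes(value_field, "little")
```

For this entry `inline` is false, because six ASCII bytes do not fit in the 4-byte field, so `offset` (134) points at the string data instead.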

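A fairly complete piece of Python code for parsing JPEG Exif data can be found online:

It parses the Exif information of a JPEG without problems, but it makes errors when parsing the other parts. Pay attention to the Python version.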


2. Example

00: ff d8 ff e1 02 62 45 78 69 66 00 00 4d 4d 00 2a
10: 00 00 00 08 00 08 01 0f 00 02 00 00 00 04 48 54
20: 43 00 01 10 00 02 00 00 00 0a 00 00 00 6e 01 1a
30: 00 05 00 00 00 01 00 00 00 78 01 1b 00 05 00 00
40: 00 01 00 00 00 80 01 28 00 03 00 00 00 01 00 02
50: 00 00 02 13 00 03 00 00 00 01 00 01 00 00 87 69
60: 00 04 00 00 00 01 00 00 00 88 88 25 00 04 00 00
70: 00 01 00 00 01 60 00 00 00 00 44 65 73 69 72 65
80: 20 48 44 00 00 00 00 48 00 00 00 01 00 00 00 48
90: 00 00 00 01 00 0b 88 27 00 03 00 00 00 01 00 88
a0: 00 00 90 00 00 07 00 00 00 04 30 32 32 30 90 03
b0: 00 02 00 00 00 14 00 00 01 12 90 04 00 02 00 00
c0: 00 14 00 00 01 26 91 01 00 07 00 00 00 04 01 02
d0: 03 00 92 0a 00 05 00 00 00 01 00 00 01 3a a0 00
e0: 00 07 00 00 00 04 30 31 30 30 a0 01 00 03 00 00
f0: 00 01 00 01 00 00 a0 02 00 04 00 00 00 01 00 00



00: ff d8 ff e1 02 62 45 78 69 66 00 00

ff d8: SOI (start of image); JPEG files always start with this.

ff e1: the marker that starts the Exif segment.

02 62: the length of the Exif data, 2*256 + 6*16 + 2 = 610.

45 78 69 66: the ASCII code for the string "Exif".

00 00: two reserved bytes.



TIFF Image File Header (8 bytes)

0C: 4d 4d 00 2a 00 00 00 08
Bytes 0-1 = 4d 4d = "MM" = big-endian byte order
Bytes 2-3 = 00 2a = 42 = identifies the data as a TIFF file
Bytes 4-7 = 00 00 00 08 = 8 = offset in bytes of the 0th IFD


The IFD data is read starting at 0C + offset = 0x14:

0th IFD (1st IFD), entry count (2 bytes long)
14: 00 08
Bytes 0-1 = number of entries = 8, so there are 8 IFD entries in total.

1st Tag in 0th IFD (12 bytes long)
16: 01 0f 00 02 00 00 00 04 48 54 43 00
Bytes 0-1 (Tag ID) = 01 0f = 271 = Make
Bytes 2-3 (Tag Type) = 00 02 = 2 = ASCII
Bytes 4-7 (Count) = 00 00 00 04 = 4 = 4 characters
Bytes 8-11 (Value) = 48 54 43 00 = "HTC\0"; the tag value is stored here directly (not an offset)

2nd Tag in 0th IFD (12 bytes long)
22: 01 10 00 02 00 00 00 0a 00 00 00 6e
Bytes 0-1 (Tag ID) = 01 10 = 272 = Model
Bytes 2-3 (Tag Type) = 00 02 = 2 = ASCII
Bytes 4-7 (Count) = 00 00 00 0a = 10 = 10 characters
Bytes 8-11 (Value) = 00 00 00 6e = offset = 0x6e = 110 bytes. This means the value is an ASCII string of length 10 starting at byte 122 = 0x7A of the file.
7A: 44 65 73 69 72 65 20 48 44 00 = "Desire HD\0"

The file position 122 is computed from the start of the TIFF header: 0x0C + 110 = 12 + 110 = 122 = 0x7A.
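The walk-through above can be replayed in code. The sketch below copies the first 0x84 bytes of the hex dump in this section and reads the ASCII tags of the 0th IFD, taking the value either from the entry itself or from the offset it points to. `read_ascii_tags` and `TIFF_START` are names chosen for this illustration, and the function ignores every type other than ASCII.

```python
import struct

# First 0x84 bytes of the example dump (up to the end of the Model string)
DUMP = bytes.fromhex(
    "ff d8 ff e1 02 62 45 78 69 66 00 00 4d 4d 00 2a "
    "00 00 00 08 00 08 01 0f 00 02 00 00 00 04 48 54 "
    "43 00 01 10 00 02 00 00 00 0a 00 00 00 6e 01 1a "
    "00 05 00 00 00 01 00 00 00 78 01 1b 00 05 00 00 "
    "00 01 00 00 00 80 01 28 00 03 00 00 00 01 00 02 "
    "00 00 02 13 00 03 00 00 00 01 00 01 00 00 87 69 "
    "00 04 00 00 00 01 00 00 00 88 88 25 00 04 00 00 "
    "00 01 00 00 01 60 00 00 00 00 44 65 73 69 72 65 "
    "20 48 44 00"
)

TIFF_START = 0x0C  # SOI (2) + APP1 marker (2) + size (2) + "Exif\0\0" (6)


def read_ascii_tags(dump: bytes, tiff_start: int = TIFF_START) -> dict:
    """Return {tag: string} for the ASCII entries of the 0th IFD."""
    tiff = dump[tiff_start:]  # all offsets are relative to the TIFF header
    order = ">" if tiff[:2] == b"MM" else "<"
    (ifd_offset,) = struct.unpack(order + "I", tiff[4:8])
    (n_entries,) = struct.unpack(order + "H", tiff[ifd_offset:ifd_offset + 2])
    result = {}
    for i in range(n_entries):
        start = ifd_offset + 2 + 12 * i
        tag, type_id, count = struct.unpack(order + "HHI", tiff[start:start + 8])
        if type_id != 2:  # this sketch only handles ASCII
            continue
        if count <= 4:  # short strings live in the value field itself
            raw = tiff[start + 8:start + 8 + count]
        else:  # longer strings are stored at an offset
            (offset,) = struct.unpack(order + "I", tiff[start + 8:start + 12])
            raw = tiff[offset:offset + count]
        result[tag] = raw.rstrip(b"\x00").decode("ascii")
    return result


tags = read_ascii_tags(DUMP)
```

With this dump, `tags` maps 271 (Make) to "HTC" and 272 (Model) to "Desire HD", matching the manual decoding.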

3. References

1. Exif file format: http://www.media.mit.edu/pia/Research/deepview/exif.html

2. TIFF 6.0 format specification: http://partners.adobe.com/public/developer/en/tiff/TIFF6.pdf

3. Anatomy of a JPG image: http://www.itbrigadeinc.com/post/2012/03/06/Anatomy-of-a-JPG-image.aspx







