ByCAN: Reverse Engineering Controller Area Network (CAN) Messages from Bit to Byte Level

Lin, Xiaojie; Ma, Baihe; Wang, Xu; Yu, Guangsheng; He, Ying; Liu, Ren Ping; Ni, Wei

arXiv.org Artificial Intelligence 

Abstract--As the primary standard protocol for modern cars, the Controller Area Network (CAN) is a critical research target for automotive cybersecurity threats and autonomous applications.

The Controller Area Network (CAN) protocol was first developed by Bosch in the 1980s [1] and serves as the de facto standard protocol for connecting the Electronic Control Units (ECUs) embedded in cars [3]-[5]. The standard CAN frame is composed of the start of frame, arbitration field, control field, data field, CRC field, ACK field, and end of frame, as shown in Figure 1. While the CAN protocol has a standardized frame structure, understanding how the protocol is used for signal transmission remains challenging. This is because Original Equipment Manufacturers (OEMs) encode the signals within the CAN frames' data fields (data payloads) in proprietary ways that vary among OEMs, vehicle models, and years [6]. CAN messages are structured into frames, and the CAN frames of different CAN IDs have different lengths of the data payload.

OBD-II diagnostic data is easy to access via the OBD-II port, as all modern cars are equipped with the OBD-II diagnostic system. OBD-II diagnostic data can be converted with public formulas into human-readable, accurate vehicle data, which is used in the matching process for associating semantic meanings with CAN signals. Both OBD-II diagnostic data and regular CAN frames can be collected from the OBD-II port. Reverse engineering (RE) systems can leverage both CAN and OBD-II diagnostic data to create a comprehensive dataset for reverse engineering purposes, eliminating the need for additional measurement equipment like IMUs.

The primary objective of a CAN RE system is to identify the [...] frames is the first step to extracting the essential information to develop autonomous applications or explore automotive [...]
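To make the "public formulas" for OBD-II data concrete, the following is a minimal sketch of decoding a mode 01 diagnostic response carried in a CAN payload. It is an illustration, not the method used in the paper. It assumes the common single-frame layout (length byte, response mode 0x41, PID, then data bytes A and B) and covers only three well-known SAE J1979 PIDs. The names `PID_FORMULAS` and `decode_obd_response` are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Publicly documented mode 01 PIDs (SAE J1979) and their conversion formulas.
# d[0] is data byte A and d[1] is data byte B of the response.
PID_FORMULAS: Dict[int, Tuple[str, str, Callable[[bytes], float]]] = {
    0x05: ("coolant_temp", "degC", lambda d: d[0] - 40),
    0x0C: ("engine_rpm", "rpm", lambda d: (256 * d[0] + d[1]) / 4),
    0x0D: ("vehicle_speed", "km/h", lambda d: d[0]),
}


def decode_obd_response(payload: bytes) -> Tuple[str, float, str]:
    """Decode a single-frame mode 01 response from a CAN data payload.

    Assumed layout: [length, 0x41, PID, A, B, ...padding], where the
    length byte counts the mode byte, the PID byte, and the data bytes.
    """
    if len(payload) < 4:
        raise ValueError("payload too short for a mode 01 response")
    length, mode, pid = payload[0], payload[1], payload[2]
    if mode != 0x41:
        raise ValueError(f"not a mode 01 response: 0x{mode:02X}")
    if pid not in PID_FORMULAS:
        raise KeyError(f"no formula registered for PID 0x{pid:02X}")
    name, unit, formula = PID_FORMULAS[pid]
    data = payload[3:1 + length]  # drop the header bytes and trailing padding
    return name, float(formula(data)), unit


if __name__ == "__main__":
    # Example payload: 4 meaningful bytes, mode 0x41, PID 0x0C, A=0x1A, B=0xF8.
    print(decode_obd_response(bytes.fromhex("04410C1AF8000000")))
```

Values decoded this way are the kind of labeled reference signal that could be matched against candidate CAN signals. The proprietary CAN payloads themselves have no such public formula, which is why they need reverse engineering.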