DataSIR: A Benchmark Dataset for Sensitive Information Recognition

Neural Information Processing Systems 

A.1 Comparison of Results for Gemini with Different Format Transformations

Gemini achieved the best results on both the sensitive category task and the format transformation task, with a higher peak performance than every other model compared. We therefore examined Gemini's ability to recognize and restore both original and format-transformed data. The experimental results are shown in Table 1. Due to space constraints, the Experiments section of the main text analyzes only four key observations, as follows: i) The LRAcc and DRAcc on format-transformed data overall are lower than on the original data, which indicates that data is harder to recognize and restore after format transformation. These transformations only affect numbers, and only the IMEI and IMSI (purely numeric) sensitive categories support such transformations. Because the sample data lacks contextual information, large language models may confuse IMEI and IMSI values with personal identifiers, mobile numbers, and MEID.
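To illustrate why purely numeric values are hard to categorize without context, the following minimal sketch checks a string against surface-format rules only. It is not part of the DataSIR pipeline: the function names `luhn_valid` and `candidate_categories` are hypothetical, and the rules are simplified assumptions (a 15-digit IMSI, and a 15-digit IMEI whose last digit is a Luhn check digit).

```python
import re


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def candidate_categories(value: str) -> list[str]:
    """List the categories whose surface format the value matches.

    Simplified, assumed rules for illustration only:
    - IMSI: exactly 15 decimal digits (real IMSIs may be shorter).
    - IMEI: exactly 15 decimal digits with a valid Luhn check digit.
    """
    candidates = []
    if re.fullmatch(r"\d{15}", value):
        candidates.append("IMSI")
        if luhn_valid(value):
            candidates.append("IMEI")
    return candidates


if __name__ == "__main__":
    # A commonly cited example IMEI; by shape alone it is also a plausible IMSI.
    print(candidate_categories("490154203237518"))
```

Under these assumed rules, a value that is a valid IMEI also matches the IMSI shape, so format checks alone cannot separate the two categories; surrounding context would be needed to disambiguate.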
