Abstract: The construction of test scenarios is crucial for evaluating the operational risks of autonomous vehicles (AVs) on the road. This study draws on California autonomous vehicle collision reports from 2019 to 2021. First, each crash sequence was described as a combination of an initial stage, a triggering event, a response action, and a crash stage, and the sequences were extracted from the crash reports to form basic test scenarios. Second, the dissimilarity between sequences was measured using sequence alignment methods. Finally, nine scenario groups and their typical scenarios were obtained using cluster analysis. Frequency analysis shows that the most representative crash pattern is an AV that has stopped and is struck by a vehicle traveling directly behind it. Cross-tabulation analysis shows that the scenario groups were significantly associated with variables measuring crash outcomes and describing the road environment. The research methods provide a new approach to extracting AV crash scenarios, and the results offer a reference for understanding AV crash patterns.
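As a minimal illustration of the sequence-dissimilarity step, the sketch below computes a unit-cost edit distance between crash sequences and picks the medoid of a group as a stand-in for its "typical scenario". The abstract does not specify the alignment costs, the clustering algorithm, or the event coding, so the cost values, the function names (`edit_distance`, `dissimilarity_matrix`, `medoid`), and the event labels are all assumptions made for illustration only.

```python
from itertools import combinations


def edit_distance(a, b, indel=1.0, sub=1.0):
    """Edit distance between two event sequences (lists of labels).

    The unit insertion/deletion and substitution costs are assumptions;
    the study's actual cost settings are not given in the abstract.
    """
    prev = [j * indel for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * indel]
        for j, y in enumerate(b, start=1):
            cost = 0.0 if x == y else sub
            curr.append(min(prev[j] + indel,       # delete x
                            curr[j - 1] + indel,   # insert y
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


def dissimilarity_matrix(seqs):
    """Symmetric matrix of pairwise edit distances."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = edit_distance(seqs[i], seqs[j])
    return d


def medoid(d, members):
    """Index of the member with the smallest total dissimilarity to the group."""
    return min(members, key=lambda i: sum(d[i][j] for j in members))


# Hypothetical sequences: initial stage, triggering event, response, crash stage.
sequences = [
    ["AV_stopped", "vehicle_approaching_from_rear", "no_response", "rear_end"],
    ["AV_stopped", "vehicle_approaching_from_rear", "AV_brakes", "rear_end"],
    ["AV_going_straight", "vehicle_changes_lane", "AV_brakes", "sideswipe"],
]
d = dissimilarity_matrix(sequences)
typical = medoid(d, [0, 1, 2])
```

The resulting matrix could be passed to any clustering routine that accepts precomputed distances; using the medoid as the representative is one common choice, not necessarily the one used in the study.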