Soybean seed contains a wide range of secondary metabolites, such as isoflavones, phytosterols, lecithins and saponins. These metabolites differ widely in chemical structure and properties, which makes it difficult to analyze them simultaneously. We assessed LC-MS profiling as a means to evaluate seed component diversity in 33 soybean cultivars and to identify the diverse substances from their fragmentation patterns. Principal component analysis (PCA) of the seed component profiles clearly divided the 33 cultivars into two groups. Soluble extracts from both hypocotyl and cotyledon were characterized by the presence of a compound at m/z 969.5 in Group 1 and by a compound at m/z 980.6 in Group 2. Two cultivars, Williams 82 and Enrei, were selected, one from each group, and subjected to further analyses. Peptide mass fingerprint (PMF) data generated by Q-TOF analysis, combined with a MASCOT database search, identified the compounds as the 4-kDa peptide (Albumin 1b), which is composed of 37 amino acids. The peptides of the two groups differed by three amino acid substitutions. Three candidate genomic sequences were found in the soybean genome. Expression analysis by RT-PCR indicated that one of the three sequences encodes the 4-kDa peptide and is expressed in developing seed. This study confirms that comprehensive analysis with LC-MS is a powerful tool for elucidating metabolite diversity in plant materials, including soybean seed.
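
The PCA-based grouping of profile data mentioned above can be illustrated with a minimal sketch. It assumes a samples-by-features intensity matrix and uses a plain SVD from NumPy; the function name `pca_scores`, the matrix dimensions, and all intensity values are hypothetical and synthetic, chosen only for illustration. They are not the study's data, and this is not necessarily the software or preprocessing the authors used.

```python
import numpy as np


def pca_scores(intensity, n_components=2):
    """Project samples (rows) onto the leading principal components."""
    # Mean-center each feature (column) so PCA describes variance, not offsets.
    centered = intensity - intensity.mean(axis=0)
    # SVD of the centered matrix: rows of vt are the principal axes.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Sample scores are the left singular vectors scaled by the singular values.
    scores = u[:, :n_components] * s[:n_components]
    # Fraction of total variance explained by each retained component.
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Synthetic, made-up intensities: 6 hypothetical samples x 4 hypothetical features.
# The first three samples are dominated by feature 0, the last three by feature 1.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.1, size=(3, 4)) + np.array([5.0, 0.0, 1.0, 1.0])
group_b = rng.normal(0.0, 0.1, size=(3, 4)) + np.array([0.0, 5.0, 1.0, 1.0])
profiles = np.vstack([group_a, group_b])

scores, explained = pca_scores(profiles)
# The sign of the PC1 score separates the two synthetic groups.
print(np.sign(scores[:, 0]))
```

In this toy setting, a single feature that is present in one group and absent in the other is enough to place the groups on opposite sides of the first principal component, which mirrors how one group-specific ion could drive a two-group split.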