Exploratory Data Analysis on Username-Password Dataset
Vanita Jain, Bharati Vidyapeeth's College of Engineering, INDIA
Mahima Swami, Bharati Vidyapeeth's College of Engineering, INDIA
Rishab Bansal, Bharati Vidyapeeth's College of Engineering, INDIA

Passwords are the first line of defense against malicious or unauthorized access to personal information. As more services are digitized, choosing strong passwords has become even more important. In this paper, the authors perform Exploratory Data Analysis on an email-password database of 100 million entries. The analysis yields statistics on the most common passwords, password character sets, the most common domains, average password length, password strength, the frequencies of letters, numbers, and symbols (special characters), the most common letter, number, and symbol, and the ratio of letters, numbers, and symbols in passwords. Together, these statistics highlight the general trends that users follow when creating passwords. Using the results of this paper, users can make informed decisions when creating their own passwords: by avoiding the most common features, they can create more robust and less vulnerable passwords.
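
As an illustration of the kind of statistics listed above, the following is a minimal Python sketch and not the authors' implementation. It assumes the data is available as (email, password) pairs; the function names `char_class` and `summarize`, the three-way split into letters, numbers, and symbols, and the sample records are all hypothetical and chosen only for this example. Password strength is not covered here, since the abstract does not say how it is measured.

```python
from collections import Counter


def char_class(ch):
    """Classify one character as a letter, a number, or a symbol (assumed split)."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "number"
    return "symbol"


def summarize(records):
    """Compute summary statistics for an iterable of (email, password) pairs."""
    records = list(records)
    passwords = [pw for _, pw in records]

    # Domain is taken as everything after the last "@" in the email address.
    domains = Counter(email.rsplit("@", 1)[-1].lower() for email, _ in records)
    password_counts = Counter(passwords)

    # Count characters per class, and per individual character within each class.
    class_counts = Counter()
    char_counts = {"letter": Counter(), "number": Counter(), "symbol": Counter()}
    for pw in passwords:
        for ch in pw:
            cls = char_class(ch)
            class_counts[cls] += 1
            char_counts[cls][ch] += 1

    total_chars = sum(class_counts.values())
    return {
        "top_passwords": password_counts.most_common(3),
        "top_domains": domains.most_common(3),
        "average_length": total_chars / len(passwords) if passwords else 0.0,
        "class_ratio": (
            {cls: class_counts[cls] / total_chars for cls in char_counts}
            if total_chars
            else {}
        ),
        "most_common_char": {
            cls: counts.most_common(1)[0][0]
            for cls, counts in char_counts.items()
            if counts
        },
    }


# Made-up sample records, for illustration only.
sample = [
    ("alice@example.com", "123456"),
    ("bob@example.com", "password1"),
    ("carol@example.org", "123456"),
    ("dave@example.net", "qwerty!"),
]
stats = summarize(sample)
print(stats["top_passwords"][0])
print(stats["average_length"])
```

For a dataset of the size described, the same counters could be updated while streaming the file line by line rather than loading every record into memory first.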