
American Scientific Publishing Group

Journal

Fusion: Practice and Applications

ISSN: Online 2692-4048, Print 2770-0070

Frequency: Continuous publication

Publication Model: Open access · Articles freely available online · APC applies after acceptance

Full Length Article

Volume 4, Issue 1, pp. 5-14, 2021

Exploratory Data Analysis on Username-Password Dataset

Vanita Jain 1*, Mahima Swami 1, Rishab Bansal 1
1 Bharati Vidyapeeth's College of Engineering, India
* Corresponding Author.
Received January 07, 2021; Accepted May 10, 2021

Abstract

Passwords are the first line of defense against malicious or unauthorized access to personal information. As digitization increases, choosing strong passwords has become even more important. In this paper, the authors perform exploratory data analysis on a database of 100 million email-password pairs. The analysis yields statistics on the most common passwords, the character sets used in passwords, the most common domains, average password length, and password strength. It also reports the frequencies of letters, numbers, and symbols (special characters); the most common letter, number, and symbol; and the ratio of letters, numbers, and symbols in passwords. Together, these statistics highlight the general trends that users follow when creating passwords. Using these results, users can make informed decisions when creating their own passwords, namely by avoiding the most common features, which helps them create more robust and less vulnerable passwords.
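A minimal Python sketch of how a few of the statistics listed above could be computed is given here. It is an illustration only, not the authors' code. It assumes each record is a string of the form `email:password`, that "domain" means the part of the email address after `@`, and that any character that is neither a letter nor a digit counts as a symbol. The function name `summarize` and the sample records are hypothetical.

```python
from collections import Counter


def summarize(records, top_n=3):
    """Compute basic password statistics from 'email:password' strings.

    The record format is an assumption made for this sketch.
    """
    passwords = []
    domains = Counter()
    for line in records:
        # Split on the first ':' only, because passwords may contain ':'.
        email, sep, password = line.rstrip("\r\n").partition(":")
        if not sep or "@" not in email or not password:
            continue  # skip malformed rows
        passwords.append(password)
        domains[email.rsplit("@", 1)[1].lower()] += 1

    classes = Counter()  # counts per character class
    chars = Counter()    # counts per individual character
    for pw in passwords:
        for ch in pw:
            chars[ch] += 1
            if ch.isalpha():
                classes["letters"] += 1
            elif ch.isdigit():
                classes["digits"] += 1
            else:
                classes["symbols"] += 1

    total_chars = sum(classes.values())
    return {
        "count": len(passwords),
        "average_length": total_chars / len(passwords) if passwords else 0.0,
        "top_passwords": Counter(passwords).most_common(top_n),
        "top_domains": domains.most_common(top_n),
        # Share of all password characters that fall in each class.
        "class_ratio": (
            {k: v / total_chars for k, v in classes.items()} if total_chars else {}
        ),
        "most_common_char": chars.most_common(1),
    }


# Made-up sample records, used only to exercise the function.
sample = [
    "alice@example.com:123456",
    "bob@example.com:password",
    "carol@mail.test:123456",
    "dave@mail.test:P@ss:w0rd",
    "not-a-record",
]
print(summarize(sample))
```

For a dataset on the scale described in the abstract, the same counters could be fed from a file read line by line rather than from an in-memory list, since `Counter` updates incrementally.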

Keywords

Data Analysis, Username-Password Dataset, Data Security

Cite This Article

Jain, Vanita, Swami, Mahima, Bansal, Rishab. "Exploratory Data Analysis on Username-Password Dataset." Fusion: Practice and Applications, vol. 4, no. 1, 2021, pp. 5-14. DOI: https://doi.org/10.54216/FPA.040101
Jain, V., Swami, M., Bansal, R. (2021). Exploratory Data Analysis on Username-Password Dataset. Fusion: Practice and Applications, 4(1), 5-14. DOI: https://doi.org/10.54216/FPA.040101
Jain, Vanita, Swami, Mahima, Bansal, Rishab. "Exploratory Data Analysis on Username-Password Dataset." Fusion: Practice and Applications 4, no. 1 (2021): 5-14. DOI: https://doi.org/10.54216/FPA.040101
Jain, V., Swami, M., Bansal, R. (2021) 'Exploratory Data Analysis on Username-Password Dataset', Fusion: Practice and Applications, 4(1), pp. 5-14. DOI: https://doi.org/10.54216/FPA.040101
Jain V, Swami M, Bansal R. Exploratory Data Analysis on Username-Password Dataset. Fusion: Practice and Applications. 2021;4(1):5-14. DOI: https://doi.org/10.54216/FPA.040101
V. Jain, M. Swami, R. Bansal, "Exploratory Data Analysis on Username-Password Dataset," Fusion: Practice and Applications, vol. 4, no. 1, pp. 5-14, 2021. DOI: https://doi.org/10.54216/FPA.040101