2024-10-24, Hollenfels
The application of Natural Language Processing (NLP) has become increasingly vital for cybersecurity threat intelligence and response strategies. NLP enables more accurate and nuanced analysis of potential threats through linguistic techniques. Among other applications, it allows quicker categorization of threats by their nature, such as phishing schemes or anomalous behaviors, so that responses can be prioritized accordingly. NLP can also facilitate the development of content prediction schemes for analysts and provide powerful information extraction tools. We will cover two text-mining techniques that we believe are a good starting point in NLP for analysts and incident responders: sentiment analysis and Named Entity Recognition (NER). Sentiment analysis reveals underlying emotions or biases in social media content potentially linked to malicious activities, while NER identifies critical information such as IP addresses, domains, and user details, which is essential for correlating incidents across different data sources.
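To make the extraction target concrete, here is a minimal rule-based sketch of pulling IP addresses and domains out of free text. It is not transformer-based NER and not the workshop's method: it is a regex baseline, and the function name `extract_indicators` and both patterns are assumptions chosen for illustration.

```python
import ipaddress
import re

# Deliberately simple candidate patterns (assumptions for illustration).
# The domain pattern also matches things like file names ("report.pdf"),
# so expect false positives on real data.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b",
    re.IGNORECASE,
)


def extract_indicators(text):
    """Return IPv4 addresses and domain-like strings found in text."""
    ips = []
    for candidate in IPV4_RE.findall(text):
        try:
            # Reject candidates such as 999.1.1.1 that only look like an IP.
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue
        ips.append(candidate)
    domains = [d.lower() for d in DOMAIN_RE.findall(text)]
    return {"ipv4": ips, "domain": domains}
```

A baseline like this is cheap to run, but it cannot recognize entities that have no fixed shape, such as user names or organization names. That gap is where a learned NER model becomes useful.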
The workshop provides a hands-on, iterative deep dive into transformer-based NLP techniques and their applications in text mining and generation for cybersecurity threat intelligence and response strategies. It is intended for people who already have some experience with natural language processing and LLMs, or with LLMs only, and who want to deepen their understanding and skills.
Program:
- Quick introduction to transformers and the best current models
- Hands-on: Text Preprocessing and Tokenization
- Text-preprocessing
- Transformer-Based Sentiment Analysis
- Choose and load a pre-trained transformer model
- Step-by-step building of an NLP pipeline using transformers library
- Run the sentiment analysis task on an imported dataset
- Adapt the same pipeline to Named Entity Recognition (NER)
- Results interpretation
- Adapt the same pipeline to text generation
- Compare basic and light models (e.g., BART, T5, Llama)
- Discussion: Applications in Cybersecurity
- Apply transformer-based NLP techniques to cybersecurity problems
- Limitations and future directions of transformer-based NLP in cybersecurity
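As a rough idea of what the sentiment analysis and NER steps in the program can look like, here is a minimal sketch built on the `pipeline` helper from the Hugging Face `transformers` library. The helper names `load_pipeline`, `label_counts`, and `entities_by_type` are hypothetical, and the models are whatever defaults the library selects. None of this is taken from the workshop material.

```python
from collections import Counter


def load_pipeline(task, model=None):
    """Build a Hugging Face pipeline for the given task.

    The import is deferred so the helpers below can be used (and tested)
    without transformers installed. With model=None the library falls back
    to its default model for the task, which is an assumption here and is
    not tuned for security text.
    """
    from transformers import pipeline

    if task == "ner":
        # "simple" aggregation merges word pieces into whole entities.
        return pipeline("ner", model=model, aggregation_strategy="simple")
    return pipeline(task, model=model)


def label_counts(texts, classifier):
    """Run a text-classification pipeline and count the predicted labels."""
    results = classifier(list(texts))
    return Counter(r["label"] for r in results)


def entities_by_type(text, ner):
    """Group the output of an aggregated NER pipeline by entity type."""
    grouped = {}
    for ent in ner(text):
        grouped.setdefault(ent["entity_group"], []).append(ent["word"])
    return grouped
```

In use, `label_counts(posts, load_pipeline("sentiment-analysis"))` summarizes a batch of posts, and `entities_by_type(report, load_pipeline("ner"))` groups the entities found in one report. Because both helpers take the pipeline as an argument, swapping in a different model does not change the surrounding code.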
By the end of this workshop, you will have a deep understanding of transformer-based NLP techniques and their applications in text mining and generation for cybersecurity. You will be able to apply your new skills to real-world problems and develop practical solutions for threat intelligence and incident response. You'll be able to work directly on the code and scale your analysis.
Familiarity with Python programming is expected. Prior experience with deep learning libraries such as PyTorch is a plus, as is hands-on practice with LLMs (through a front end).
Pauline's focus gravitates towards offensive cybersecurity, artificial intelligence, and programming culture. She has a background in various fields, including linguistics, criminology, cybersecurity, computer engineering, and education. By blending approaches from the humanities with deep technical insight, she provides a unique lens on cyber threats and their evolution. These days she provides AI development and training to make AI accessible to all. She is the founder of the Defcon group Paris and a French vice-champion para-climber.
William manages the technical team behind AS197692 at Conostix S.A. in Luxembourg. He has been working in cybersecurity, using free and open-source software on a daily basis, for more than 25 years. He presented his ASN.1 templating tool at Pass the SALT 2023 in Lille, and he recently contributed to the cleanup and enhancement of ssldump. He particularly enjoys tinkering with open (and not so open) hardware. Currently he likes playing around with new tools in the ML scene, building, hopefully, useful systems for fun and, maybe, profit. When not behind an intelligent wannabe machine, he's doing analog music with his band of humans.