Call for Participation
You are invited to participate in the IWSPA-AP Shared Task at
IWSPA 2018. The shared task will be on Detection and Analysis of Email nature.
The International Workshop on Security and Privacy Analytics (IWSPA) - Anti Phishing Shared Task will feature an exercise in the field of applied machine learning and text analysis in cyber security.
Participants will be asked to build a classifier that can distinguish phishing emails from spam and legitimate ones in an "unbalanced" dataset.
To make the task reflect a real-world situation, the training and testing datasets will have realistic ratios of malicious to legitimate emails (not 50:50).
A sample training dataset will be provided, and the results will be evaluated on a testing dataset that will be posted a week before the results are due.
We ask participants to send us their trained model and the results they achieved on the testing dataset.
Participants are encouraged to use any data in their possession, in addition to the data provided, to train their model.
Participants are also free to use any kind of feature engineering and any type of classifier. However, keep in mind that the dataset is "unbalanced".
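As an illustration of what accounting for the imbalance can look like, the sketch below computes class weights that are inversely proportional to class frequency. This is only one common heuristic (the same formula scikit-learn documents for its "balanced" class weights), not a requirement of the task. The function name and the label strings are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count).

    Rare classes get larger weights, so errors on them cost more
    when the weights are passed to a classifier's loss function.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical 90:10 split between legitimate and phishing emails.
example_labels = ["legit"] * 90 + ["phishing"] * 10
weights = balanced_class_weights(example_labels)
print(weights)
```

With such a split, a classifier that always predicts the majority class would already reach 90% accuracy, which is why accuracy alone says little on unbalanced data.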
The proceedings will be published online through the CEUR publication service. This year we will also invite the authors of selected system papers from the Shared Task
to submit extended versions to a special issue of a journal (details coming soon).
The overall task description consists of the following:
Use the training dataset provided and/or any dataset available online.
Analyze email content (Header, Body, URLs). The emails will be in .txt format.
Preferably come up with new and interesting features and/or use existing ones in the literature.
Build and train a machine learning model or use an already existing one.
Finally, report the results using the evaluation metrics specified below.
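The content-analysis step above could be sketched as follows. This is a minimal example using only Python's standard library, assuming each .txt file holds one email in the usual header/blank line/body layout. The feature names, the URL pattern, and the `has_headers` flag are illustrative assumptions, not part of the task definition.

```python
import re
from email import message_from_string
from email.message import Message

# Deliberately simple URL pattern (an assumption; real emails may need more care).
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def get_body_text(msg: Message) -> str:
    """Concatenate all text/plain parts of a parsed message."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # bytes, or None for containers
            if payload is not None:
                charset = part.get_content_charset() or "utf-8"
                parts.append(payload.decode(charset, errors="replace"))
    return "\n".join(parts)


def extract_features(raw_email: str, has_headers: bool = True) -> dict:
    """Turn one raw email into a small dictionary of numeric features."""
    if has_headers:
        msg = message_from_string(raw_email)
        body = get_body_text(msg)
    else:
        # Header-less emails: treat the whole text as the body.
        msg = None
        body = raw_email

    features = {
        "body_length": len(body),
        "num_urls": len(URL_PATTERN.findall(body)),
        "num_exclamations": body.count("!"),
    }
    if msg is not None:
        features["subject_length"] = len(msg.get("Subject", ""))
        features["has_reply_to"] = int(msg.get("Reply-To") is not None)
    return features


sample = (
    "From: alice@example.com\n"
    "Reply-To: bob@example.org\n"
    "Subject: Verify your account\n"
    "\n"
    "Click http://example.com/login now!\n"
)
print(extract_features(sample))
```

The resulting dictionaries can be fed to any classifier of the participant's choice. These few features are placeholders for the richer ones the task encourages.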
Probable subtasks: We may post two types of training datasets:
Emails with headers: For this type of dataset, the participants are free to use all the content available in an email to extract information.
Emails with no headers: This task will only focus on the body of the emails. Participants may use any type of information extraction related to the body.
Evaluation Metrics: The expected evaluation metrics are the confusion matrix (FP, FN, TP, TN), accuracy, F-score, precision, recall, and the weighted average of recall and precision.
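For reference, the metrics listed above can be computed from predictions as in the sketch below. It assumes a binary setting with phishing as the positive class. The call does not define how the "weighted average" is weighted, so the code uses one common reading, per-class precision and recall weighted by class support. That reading, the label strings, and the function names are assumptions.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN for a binary task with the given positive label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def safe_div(numerator, denominator):
    """Division that returns 0.0 instead of failing on an empty denominator."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred, positive="phishing"):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)

    # Precision and recall of the negative class, needed for the weighted average.
    neg_precision = safe_div(tn, tn + fn)
    neg_recall = safe_div(tn, tn + fp)
    pos_support, neg_support = tp + fn, tn + fp

    return {
        "confusion": {"TP": tp, "FP": fp, "FN": fn, "TN": tn},
        "accuracy": safe_div(tp + tn, total),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Assumed interpretation: support-weighted mean over both classes.
        "weighted_precision": safe_div(
            pos_support * precision + neg_support * neg_precision, total
        ),
        "weighted_recall": safe_div(
            pos_support * recall + neg_support * neg_recall, total
        ),
    }


y_true = ["phishing", "legit", "legit", "legit", "phishing"] + ["legit"] * 5
y_pred = ["phishing", "legit", "legit", "phishing", "legit"] + ["legit"] * 5
print(evaluate(y_true, y_pred))
```

In this toy example accuracy is 0.8 while precision and recall on the phishing class are only 0.5, which shows why the confusion matrix and per-class metrics are requested alongside accuracy.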
Registration is through EasyChair.
The registration deadline is January 23rd, 2018.
Organizations wishing to participate in the AP Shared Task track
at IWSPA 2018 are invited to register on EasyChair.
Participants are advised to register as soon as
possible in order to receive timely access to evaluation resources,
including development and testing data. Registration for the task
does not commit you to participation, but it helps us with
planning. All participants who submit system runs are welcome to
present their system at IWSPA 2018.
We will post the details for the training corpus on January 25th, 2018. Stay tuned!
Please consult the IWSPA
2018 Workshop page for the official workshop dates.
The important deadlines for the Shared Task:
Registration Deadline | January 23, 2018
Training Data Release | Before January 25, 2018
Test Data Release | February 25, 2018
Model + Results Submission | March 3, 2018
Start of Evaluation | March 5, 2018
End of Evaluation | March 20, 2018
All deadlines for the shared task
are calculated as 11:59pm Baker Island Time (BIT: UTC/GMT-12).
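To check what a deadline means in another time zone, the conversion can be sketched as below. The helper name is hypothetical, and the example uses the March 3, 2018 submission date from the schedule above.

```python
from datetime import datetime, timedelta, timezone

# Baker Island Time is a fixed offset of UTC-12 with no daylight saving.
BAKER_ISLAND = timezone(timedelta(hours=-12), name="BIT")


def deadline_in_utc(year: int, month: int, day: int) -> datetime:
    """Return 11:59pm Baker Island Time on the given date, expressed in UTC."""
    local_deadline = datetime(year, month, day, 23, 59, tzinfo=BAKER_ISLAND)
    return local_deadline.astimezone(timezone.utc)


submission_deadline = deadline_in_utc(2018, 3, 3)
print(submission_deadline.isoformat())  # 2018-03-04T11:59:00+00:00
```

In other words, a deadline listed for a given day ends at 11:59 UTC on the following day.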
This is the fourth workshop in the series of workshops on
Security and Privacy Analytics. Increasingly, sophisticated
techniques from machine learning, data mining, statistics and
natural language processing are being applied to challenges
in security and privacy fields. However, experts from these
areas have had no medium in the past where they can meet
and exchange ideas so that strong collaborations can
emerge, and cross-fertilization of these areas can occur.
Moreover, current courses and curricula in security do not
sufficiently emphasize background in these areas and
students in security and privacy are not emerging with deep
knowledge of these topics. Hence, we propose to continue
the workshop that we started in the year 2015 to address the
research and development efforts in which analytical
techniques from machine learning, data mining, natural
language processing and statistics are applied to solve
security and privacy challenges (“security and privacy
analytics”). Submissions of papers related to methodology,
design, techniques and new directions for security and
privacy that make significant use of machine learning, data
mining, statistics or natural language processing are
welcome. Furthermore, submissions on educational topics
and systems in the field of security analytics are also highly
encouraged.
Dr. Rakesh Verma, Professor, University of Houston
Shahryar Baki, PhD candidate, University of Houston
Avisha Das, PhD candidate, University of Houston
Ayman Elassal, PhD candidate, University of Houston
Luis Felipe Teixeira De Moraes, PhD candidate, University of Houston