Sheriffo Ceesay

School of Computer Science, University of St Andrews. sc306 at st-andrews dot ac dot uk

I am a final-year PhD student in Computer Science at the University of St Andrews. I am a member of the Systems Research Group and my work is supervised by Prof. Adam Barker. I am currently working on performance modelling of data parallel systems and applications. You can find more information about my research here.

I obtained a BSc in Computer Science from the University of The Gambia and an MSc in Information Systems and Application (Distinction) from National Tsing Hua University in Taiwan. In 2016, I earned an MSc in Data Science (Distinction) from Lancaster University, UK. In my final project at Lancaster University, I worked with data scientists at Christie NHS to help investigate and migrate some of their disparate RDBMS data sources to NoSQL and big data systems. After a rigorous feature comparison and performance benchmarking of various big data systems, we decided to use Hadoop, Spark and MongoDB.

I am a committer and PMC member of the Apache Gora Project. Apache Gora

I participated in the Google Summer of Code (GSoC) program by developing a benchmark module for Apache Gora: Final Report and Code.


My PhD research focuses on understanding the performance of big data and distributed applications running on cloud computing infrastructure. Our approach uses a combination of benchmarking and predictive modelling to infer the performance of an application running in distributed computing environments. Understanding this behavior can be useful for consumers deploying big data applications in the cloud computing environment.

The problem we are trying to solve is broadly defined as follows: given a cluster computing environment, e.g. a Hadoop or Spark cluster, how can we determine the execution time or the resource needs of an application x? A demonstration of my latest work is available from Performance Modelling of Data Parallel Applications.

In basic terms, the main objectives of my research can be grouped as follows:

  1. Find a way of predicting the execution time of different classes of big data applications running on a cluster of machines
  2. Recommend optimal cluster sizing for big data applications
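
To make these two objectives concrete, here is a minimal sketch in Python. It is a toy illustration rather than the actual models from my research: the benchmark numbers are made up, the assumption that execution time grows linearly with the data handled per node is a simplification, and the names `fit_line`, `predict_seconds` and `smallest_cluster` are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up benchmark runs: (input size in GB, worker nodes, measured seconds).
runs = [
    (10, 2, 60.0),
    (20, 2, 110.0),
    (20, 4, 60.0),
    (40, 4, 110.0),
    (40, 8, 60.0),
    (60, 4, 160.0),
]

# Assumed feature: GB of input handled per node.
gb_per_node = [size / nodes for size, nodes, _ in runs]
seconds = [t for _, _, t in runs]
intercept, slope = fit_line(gb_per_node, seconds)


def predict_seconds(size_gb, nodes):
    """Objective 1: predict execution time for a given input size and cluster."""
    return intercept + slope * size_gb / nodes


def smallest_cluster(size_gb, deadline_s, max_nodes=64):
    """Objective 2: smallest node count whose predicted time meets the deadline."""
    for nodes in range(1, max_nodes + 1):
        if predict_seconds(size_gb, nodes) <= deadline_s:
            return nodes
    return None  # no cluster up to max_nodes is predicted to meet the deadline


print(predict_seconds(80, 8))
print(smallest_cluster(80, 120))
```

In this sketch the benchmarking step supplies the training runs and the fitted line plays the role of the predictive model; a real study would need many more runs, richer features and a validated model.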


Sheriffo Ceesay, Adam Barker and Blesson Varghese. Plug and Play Bench: Simplifying Big Data Benchmarking Using Containers. In Proceedings of the 2017 IEEE International Conference on Big Data, pages 2821–2828.

Sheriffo Ceesay, Adam Barker and Yuhui Lin. Benchmarking and Performance Modelling of MapReduce Communication Pattern. Accepted for publication in the 11th IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2019).

Teaching & Tutoring

In the School of Computer Science at the University of St Andrews, I am engaged in tutoring and lab demonstration of the following Undergraduate (Sub Honors) courses

I also work with the University of St Andrews ELT foundation program as a tutor. I teach the following modules

General Interests

There is no free time, but in my free time I love reading the news and books on business and leadership. I also watch highlights of football games and I do play sometimes. So if you live around Dundee or St Andrews and want to warm up, please do not hesitate to let me know.

Secondly, I am also very much interested in the politics of my home country. I just can't ignore it!