Data Analysis Software Engineering

DS04
Semester: 1st
ECTS Credits: 7.5

George Kakarontzas (Course Coordinator)

Syllabus

  • Analysis of large-scale data applications
  • Design of large-scale data applications
  • Scaling of large-scale data management applications
  • Code and data auditing and quality assurance of large-scale data applications
  • Introduction to Python
  • Using Python for large-scale data analysis applications
  • Examples of large-scale data analysis applications using Python
  • Use of version control systems
  • Software project management for large-scale data applications

Recommended Bibliography

  • Ali Davoudian and Mengchi Liu: “Big data systems: A software engineering perspective”, ACM Computing Surveys, vol. 53, no. 5, article 110, September 2020
  • Catherine Nelson: “Software Engineering for Data Scientists”, O’Reilly Media Inc., 2024
  • Miryung Kim: “Software Engineering for Data Analytics”, IEEE Software, vol. 37, no. 4, pp. 36–42, July-Aug. 2020
  • M. Kim, T. Zimmermann, R. DeLine, and A. Begel: “Data scientists in software teams: State of the art and challenges”, IEEE Transactions on Software Engineering, vol. 44, no. 11, pp. 1024–1038, Nov. 1, 2018
  • Nikolay Sydorov and Nika Sydorova: “Software Engineering and Big Data Software”, Problems in Programming, no. 3-4, 2022
  • Nazim H. Madhavji, Andriy Miranskyy, and Kostas Kontogiannis: “Big picture of Big Data software engineering: With example research challenges”, in Proceedings of the 1st International Workshop on Big Data Software Engineering, pp. 11–14, IEEE Press, 2015
  • Vaibhav Sachdeva and Lawrence Chung: “Handling non-functional requirements for Big Data and IOT projects in scrum”, in Proceedings of the 7th International Conference on Cloud Computing, Data Science & Engineering-Confluence, pp. 216–221, IEEE, 2017