# Week 9: Application to Real Data

## Topics

This week's assignments will guide you through the following topics:

* Computing workflow of CMS
* Issues related to real data

## Reading

Please read the following:

* Read about the CMS event data model (EDM):
* Learn about the CMS software stack (CMSSW):
* Browse an example pull request to CMSSW:
* Skim some papers on the framework and multithreading: {cite:p}`Jones:2014uza`

## Tasks

Propose a project for Quarter 2. Possible extensions include:

* Explore other model architectures for the same H(bb) identification task, such as transformers, tensor networks, or equivariant neural networks.
* Expand the model to do multiclass classification (classifying all flavors of QCD quarks, gluons, and H(bb) present in the dataset).
* Try an unsupervised learning approach, such as (variational) autoencoders, for anomaly detection.
* Explore model compression, knowledge distillation, quantization, or other techniques to reduce the model size or complexity. Can it be made more efficient?
* Perform a regression task, such as correcting the energy or mass of the particle jet based on generator-level "truth" information.
* Apply concepts you learned to a new dataset like the TrackML Particle Tracking Challenge: .
* Apply your algorithm to real data.
* Apply concepts you learned to a slightly different jet tagging problem for VBS production of 2H(bb) 2H(WW), which may be used in a real CMS analysis!
* Explore explainable AI for GNNs using layerwise relevance propagation (LRP) or GNNExplainer {cite:p}`gnnexplainer`.

## Weekly Questions

Answer the following questions:

* What are some differences between the CMS event data model and other processing frameworks?