Data-driven framework for understanding and predicting air quality in urban areas