Today, I discussed the number of police-involved shootings by station with my professor. I tried to build a Pareto chart from data that included police station names and the number of incidents each station was involved in. However, unlike my professor, I sorted the data in ascending order (from the lowest to the highest number of shootings). When I saw the resulting chart, I realized that this approach was incorrect. A Pareto chart requires the data to be sorted in descending order, so that the largest contributors come first and the cumulative percentage shows how quickly they add up.
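To make the ordering concrete, here is a minimal sketch of the table behind a Pareto chart: counts sorted in descending order, plus a running cumulative percentage. The function name `pareto_table` and the station counts are made up for illustration; they are not taken from our dataset.

```python
def pareto_table(counts):
    """Return (station, count, cumulative_percent) rows in Pareto order."""
    total = sum(counts.values())
    if total == 0:
        return []
    # Pareto order: largest count first; ties are broken by name.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    rows = []
    running = 0
    for station, count in ordered:
        running += count
        rows.append((station, count, round(100 * running / total, 1)))
    return rows


# Hypothetical counts, for illustration only.
example_counts = {"Station A": 5, "Station B": 20, "Station C": 10, "Station D": 15}
table = pareto_table(example_counts)
for station, count, cumulative in table:
    print(f"{station}: {count} incidents, {cumulative}% cumulative")
```

In a chart, the counts would be drawn as bars in this order and the cumulative percentage as a line on a second axis. With ascending order, the line still reaches 100%, but the largest stations end up at the far right, which defeats the purpose of the chart.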
After that, my partner and I continued exploring the dataset and performed data munging by removing unnecessary columns such as race_source, latitude, longitude, location_precision, name, and id. We noticed a gradual increase in the number of incidents from 2016 to the present. We may need to investigate which features are associated with this rise over the years.
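The cleanup and the yearly count could be sketched roughly as follows, using only the Python standard library. This is not necessarily how we did it. The dropped column names come from the list above, with `longitude` assumed to be the actual header spelling. The `date` column in ISO format, the `state` column, and every value in the sample rows are assumptions made up for illustration.

```python
import csv
import io
from collections import Counter

# Columns we decided to drop (names from the list above).
DROP_COLUMNS = {"race_source", "latitude", "longitude", "location_precision", "name", "id"}


def drop_columns(rows, unwanted):
    """Return copies of the rows without the unwanted keys."""
    return [{key: value for key, value in row.items() if key not in unwanted} for row in rows]


def incidents_per_year(rows, date_column="date"):
    """Count rows per calendar year, assuming ISO dates such as 2016-03-14."""
    return Counter(int(row[date_column][:4]) for row in rows)


# Tiny made-up sample; the real file would be opened with open(...) instead.
sample_csv = """id,name,date,race_source,latitude,longitude,location_precision,state
1,Person One,2016-01-05,example,0.0,0.0,example,WA
2,Person Two,2016-07-19,example,0.0,0.0,example,TX
3,Person Three,2017-02-11,example,0.0,0.0,example,CA
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
cleaned = drop_columns(rows, DROP_COLUMNS)
yearly = incidents_per_year(cleaned)
for year in sorted(yearly):
    print(year, yearly[year])
```

A per-year count like this only describes the trend. Finding which features are associated with the rise would need the counts broken down further, for example by one of the remaining columns.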