The objective of this research is to develop dynamic models that forecast bus arrivals at a sequence of stations in real time. The main difficulty is that GPS-based bus positions are updated at very short intervals and a forecast for every bus at every station must be produced continuously, so a massive amount of data has to be analyzed quickly to produce timely forecasts. These forecasts are posted on mobile media for the public. Because there are natural delays both in collecting the bus positions and in posting the information, the analysis has to be extremely fast for the forecasts to be useful. We present how we overcame the difficulties inherent in real-time data collection and analysis by using a technique based on analog positions, where an analog is a similar position at a similar hour on a similar day. With this technique we reduced the real-time computation to a minimum, although some time was required beforehand to build a trustworthy dataset in which the analogs are sought. We illustrate the technique using real-world bus data from Colima City, México.
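As a rough illustration of the analog idea described above, the following Python sketch looks up historical records with a similar position, a similar hour, and the same type of day, and uses the median of their observed travel times as the forecast. It is not the authors' implementation: the record fields, the tolerance values, the day-type labels, and the choice of the median are all assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    """One historical observation (hypothetical schema for this sketch)."""
    position_m: float          # distance travelled along the route, in meters
    hour: float                # hour of day, 0 <= hour < 24
    day_type: str              # assumed labels such as "weekday" or "weekend"
    seconds_to_station: float  # observed travel time to the target station


def hour_gap(a: float, b: float) -> float:
    """Difference between two hours of day, wrapping around midnight."""
    d = abs(a - b) % 24.0
    return min(d, 24.0 - d)


def find_analogs(history: List[Record], position_m: float, hour: float,
                 day_type: str, pos_tol_m: float = 100.0,
                 hour_tol: float = 1.0) -> List[Record]:
    """Return records with a similar position, similar hour, and same day type."""
    return [
        r for r in history
        if r.day_type == day_type
        and abs(r.position_m - position_m) <= pos_tol_m
        and hour_gap(r.hour, hour) <= hour_tol
    ]


def forecast_arrival(history: List[Record], position_m: float, hour: float,
                     day_type: str) -> Optional[float]:
    """Forecast seconds to the station as the median over the analogs.

    Returns None when no analog is found in the historical dataset.
    """
    analogs = find_analogs(history, position_m, hour, day_type)
    if not analogs:
        return None
    return median(r.seconds_to_station for r in analogs)
```

Under these assumptions the only work done in real time is a filter and a median over previously stored records; in practice the historical dataset could be indexed in advance by position, hour, and day type so that each lookup stays cheap.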
Keywords: Bus arrival prediction; Analogs; Non-parametric
Biography: Carlos Hernandez is a professor at the University of Colima, Mexico, and received his PhD from Cornell University in 1997. He is a member of the National System of Researchers in Mexico and CEO of Montecristo Data Mining, which, among other things, has developed and implemented a system that forecasts arrivals at stations and delivers the forecasts to mobile media in some cities in Mexico.