The things Didi hasn't told you

What lies behind the Didi app

jikegongyuan· 2017-01-01 04:48:12

Author: blackboard on duty

This article is reprinted from AI Technology Review (ID: aitechtalk), with authorization for Geek Park. The author compiled it from talks given at the Tencent Big Data Summit and the KDD China Technology Summit, and it was personally reviewed by Dr. Ye Jieping and Didi CTO Zhang Bo.

Ye Jieping is vice president of the Didi Research Institute and a tenured professor at the University of Michigan. He is an international leader in machine learning, working mainly on machine learning, data mining and big data analysis, and is especially known for his research on large-scale sparse models.

Didi applies machine learning at large scale across its travel business: destination prediction, optimal path planning, carpool matching, order dispatch, scheduling, capacity assessment, evaluation systems and more.

All of this concerns you: features you may use every day without ever knowing the machine story behind them.

Building Didi's traffic brain

Last year Didi set up a machine learning institute, which was later renamed the Didi Research Institute.


Open the Didi travel app and the home page already contains a lot of artificial intelligence:


First we precisely locate the user; below the location is the user's destination. In many cases we can predict where the user is going, because a lot of travel is regular: to work in the morning, home at night. We use the time and place in the user's travel data to predict the destination, which is one manifestation of artificial intelligence.
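As a minimal sketch of this idea, assume a hypothetical trip log of (hour, pickup, destination) records. A frequency count conditioned on a coarse time slot and the pickup place already gives a usable guess. The function names and the morning/evening bucketing are illustrative assumptions, not Didi's actual model.

```python
from collections import Counter, defaultdict

def time_slot(hour):
    """Bucket an hour of day into a coarse slot (illustrative assumption)."""
    if 6 <= hour < 11:
        return "morning"
    if 17 <= hour < 23:
        return "evening"
    return "other"

def build_model(trips):
    """trips: iterable of (hour, pickup, destination) from one user's history."""
    model = defaultdict(Counter)
    for hour, pickup, destination in trips:
        model[(time_slot(hour), pickup)][destination] += 1
    return model

def predict_destination(model, hour, pickup):
    """Most frequent past destination for this slot and pickup, or None."""
    counts = model.get((time_slot(hour), pickup))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

history = [
    (8, "home", "office"), (9, "home", "office"), (8, "home", "gym"),
    (18, "office", "home"), (19, "office", "home"),
]
model = build_model(history)
print(predict_destination(model, 8, "home"))     # office
print(predict_destination(model, 18, "office"))  # home
```

When there is no history for a slot and place, the sketch returns `None` rather than guessing.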


We also show a price estimate. It looks ordinary, but it is a very complex computation involving path planning and time estimation (ETA). Path planning from origin to destination is a core part: to find the best path we need to calculate the distance from A to B. We then estimate the time the journey will take, whether 20 minutes or 30 minutes from start to end. Combining the path and the time, we give a price estimate.
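To make the last step concrete, here is a small sketch of how a planned distance and an ETA might be combined into a price estimate. The linear formula and every rate in it are made-up placeholders for illustration, not Didi's pricing.

```python
def estimate_fare(distance_km, eta_minutes,
                  base=10.0, per_km=2.0, per_minute=0.5, minimum=13.0):
    """Combine planned distance and estimated time into a price estimate.

    All rates are hypothetical placeholders.
    """
    fare = base + per_km * distance_km + per_minute * eta_minutes
    return round(max(fare, minimum), 2)

# The same 8 km route costs more when the ETA is 30 minutes instead of 20.
print(estimate_fare(8.0, 20))  # 36.0
print(estimate_fare(8.0, 30))  # 41.0
```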


The carpooling option is also very complicated. After the user clicks carpool, we need to calculate the probability that another rider can be matched somewhere between the starting point and the end point. If the probability is low, the passenger is likely to have the car alone from start to end, and Didi gives a smaller discount, such as ten percent off. If it is a popular route, other passengers are likely to be heading to the same place or somewhere near it at the same time, and in that case we can offer a somewhat larger discount.
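A minimal sketch of that rule, assuming the match probability has already been predicted by some model. The linear interpolation and the 10% and 40% bounds are illustrative assumptions; the article only gives ten percent off as an example for the low-probability case.

```python
def carpool_discount(match_probability, min_discount=0.10, max_discount=0.40):
    """Map the predicted probability of finding a co-rider to a discount rate.

    Linear interpolation between two hypothetical bounds: a low match
    probability gets about 10% off, a popular route gets more.
    """
    if not 0.0 <= match_probability <= 1.0:
        raise ValueError("match_probability must be in [0, 1]")
    return min_discount + (max_discount - min_discount) * match_probability

def carpool_price(solo_price, match_probability):
    """Price shown to the passenger after applying the carpool discount."""
    return round(solo_price * (1.0 - carpool_discount(match_probability)), 2)

print(carpool_price(40.0, 0.0))  # 36.0
print(carpool_price(40.0, 1.0))  # 24.0
```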

Applications of artificial intelligence after the passenger hails a car

Once the user confirms the ride request, Didi has to dispatch the order and find the driver best suited to the user. This process is also a series of machine learning problems.

How do we judge whether a match is appropriate? There are several ways, for example the distance and time to the nearest driver. Behind the matching problem there is also personalized search: some users may prefer a certain type of vehicle or a certain type of driver. In particular, a female user traveling at eleven or twelve at night may have higher requirements for the vehicle and the driver, which calls for personalized matching.

If the user chooses to carpool, the system has to find the most suitable car, which may be empty or may already be carrying passengers, and at the same time calculate the travel time from A to B.

Heat map

Here we run into a situation: new drivers want to spend less time empty, but often do not know where the orders are. Didi therefore gives them a heat map that tells the driver which areas are likely to have many orders in the next half hour.

The core of Didi's artificial intelligence

The core thing the Didi Research Institute is currently working on is order dispatch. At any given moment there are thousands of passengers and thousands of idle vehicles, and we have to match drivers and passengers optimally; what we weigh is the degree of matching. The simplest way to measure it is by distance, and for its first few years Didi matched by distance. But road distance has many shortcomings, because conditions differ from section to section: some places are badly congested, so two stretches of the same one kilometer can have completely different travel times. There is an urgent need to add the dimension of time. Computing time is a big problem, even harder than estimating distance.
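The matching step can be sketched as an assignment problem over estimated pickup times. The brute-force search below is only an illustration for tiny inputs, and the ETA matrix is invented; at real scale one would use a proper assignment solver (for example the Hungarian algorithm), and nothing here is claimed to be Didi's method.

```python
from itertools import permutations

def best_assignment(eta):
    """eta[i][j]: estimated minutes for driver j to reach passenger i.

    Tries every assignment of distinct drivers to passengers and keeps the
    one with the lowest total pickup time. Assumes there are at least as
    many drivers as passengers; only practical for tiny inputs.
    """
    n_passengers, n_drivers = len(eta), len(eta[0])
    best_cost, best = float("inf"), None
    for drivers in permutations(range(n_drivers), n_passengers):
        cost = sum(eta[i][d] for i, d in enumerate(drivers))
        if cost < best_cost:
            best_cost, best = cost, drivers
    return list(best), best_cost

eta_minutes = [
    [4, 5, 9],    # passenger 0
    [5, 10, 12],  # passenger 1
]
assignment, total = best_assignment(eta_minutes)
print(assignment, total)  # [1, 0] 10
```

Giving each passenger their individually nearest driver in turn would send driver 0 to passenger 0 and leave passenger 1 with a 10-minute wait (14 minutes in total); the global match costs 10.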

So to achieve the best order matching we have to do two things: first plan paths and estimate times at large scale, then compute the best match.

Large-scale path planning

In calculating the time and distance for an order we run into a problem: Didi's data volume is large, and each passenger is matched not against one driver but against hundreds of nearby drivers. At any one time the number of candidate matches can reach 10 million or more, and tens of thousands of path plans must be completed within one or two seconds, which is a very big challenge.

This decision is different from search. With a Google search, the results are still the same 10 minutes later. In Didi's matching, even a two-second delay may mean the driver has already passed an intersection, which makes the path planning completely different. We have now set up a machine learning system that contains historical and real-time data: wherever there is a Didi vehicle, we know the speed and the traffic on that road. We then extract features and build the system, and deep learning can also be used for path planning and time estimation.
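As a simple baseline for the time estimate (not the deep learning system described here), travel time can be summed segment by segment, using a real-time speed where a vehicle is reporting and falling back to a historical speed otherwise. The segment IDs, speeds and data layout are assumptions made for the sketch.

```python
def estimate_eta_minutes(path, realtime_speed, historical_speed):
    """Sum per-segment travel times along a planned path.

    path: list of (segment_id, length_km).
    realtime_speed / historical_speed: dict of segment_id -> km/h.
    Falls back to the historical speed when no real-time reading exists.
    """
    total_hours = 0.0
    for segment_id, length_km in path:
        speed = realtime_speed.get(segment_id, historical_speed[segment_id])
        total_hours += length_km / speed
    return total_hours * 60.0

path = [("s1", 1.0), ("s2", 1.0)]          # two segments of one kilometer each
historical = {"s1": 30.0, "s2": 30.0}
realtime = {"s2": 10.0}                    # s2 is congested right now
print(round(estimate_eta_minutes(path, realtime, historical), 1))  # 8.0
```

Both segments are one kilometer long, yet the congested one takes six minutes and the other two, which is why distance alone is a poor measure.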

The Research Institute has built a deep learning system and added traffic and other information to its predictions; this should be the first system to use deep learning for path planning and time prediction. A simple comparison of results: from when we began using machine learning last year to the recent move to deep learning, the error has been reduced by about 70%.

Next we need to do the best matching, and there are many different ways. Didi has several business lines: taxi, Express, private car, luxury car and so on. Should these lines be connected? For example, a user requests Express but there may be no Express driver nearby to pick them up. Could an algorithm decide at that moment to send a private-car or taxi driver instead, so that the scheduling is optimal under a global matching and Didi's advantages are fully used?

In Beijing, the difficulty of getting a car at peak hours is usually attributed to insufficient capacity. But our analysis found that Didi's capacity at peak is actually sufficient; the main cause is the unreasonable distribution of vehicles.

For this we developed a system that divides the entire earth into numerous hexagons. At each moment we monitor every hexagon, count the orders and the empty cars in it, and work out whether supply and demand are balanced.
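The per-hexagon bookkeeping can be sketched as follows. This assumes positions have already been projected to planar coordinates in meters and uses a flat hexagonal grid with axial (q, r) indices; a grid covering the whole earth, as the article describes, is not attempted here. The cell size and sample points are invented.

```python
import math
from collections import Counter

def hex_cell(x, y, size):
    """Map planar coordinates to the axial (q, r) index of a pointy-top hexagon.

    size is the hexagon's center-to-corner distance, in the same unit as x, y.
    """
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    # Round in cube coordinates, then recompute the component with the
    # largest rounding error so the three coordinates still sum to zero.
    cx, cz = q, r
    cy = -cx - cz
    rx, ry, rz = round(cx), round(cy), round(cz)
    dx, dy, dz = abs(rx - cx), abs(ry - cy), abs(rz - cz)
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return rx, rz

def supply_demand_gap(orders, idle_cars, size=500.0):
    """Return {cell: open orders minus idle cars}; positive means a shortage."""
    demand = Counter(hex_cell(x, y, size) for x, y in orders)
    supply = Counter(hex_cell(x, y, size) for x, y in idle_cars)
    return {cell: demand[cell] - supply[cell] for cell in set(demand) | set(supply)}

orders = [(0, 0), (10, -20), (30, 40)]
idle_cars = [(5, 5), (900, 0)]
gap = supply_demand_gap(orders, idle_cars)
print(gap)  # cell (0, 0) is short two cars; cell (1, 0) has one spare car
```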

Solving the capacity problem

Drivers not being where they should be is a big problem we need to solve. A platform that holds all the information can make the best decisions on scheduling and navigation. The first way to address this is dynamic price adjustment. We are also exploring two other approaches:

Supply and demand forecasting and capacity scheduling: to make a prediction we need to reconstruct a scene. For example, when a large event ends at 6 in the evening, many people will want a car, and the forecast should reflect that. In addition, as mentioned, ordinary people travel regularly, so we can predict how many vehicles a region may lack at a given moment. We then schedule 15 minutes or half an hour in advance, moving excess capacity over from the surrounding areas to ease the imbalance. Supply and demand forecasting is essentially time series prediction.

Carpooling: if two passengers' routes and travel times are similar, there is no need for two drivers. The two orders are merged into one combined order served by a single driver. A very important issue in carpooling is user experience, which shows in two dimensions: first, it is cheap; second, picking up another person should not add too much distance or time. We want to merge two orders whose paths are similar, and to this end we built several machine learning models to estimate the degree of path matching.
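For the first of these two approaches, treating regional demand as a time series, a very simple seasonal-average forecast can stand in for the real model. The slot layout, the toy counts and the helper names are assumptions made for illustration.

```python
def forecast_next(history, season_length, n_seasons=3):
    """Seasonal-average forecast for the next time slot.

    history: past order counts for one region at fixed intervals.
    season_length: slots per cycle (for example 96 fifteen-minute slots a day).
    Averages the values seen at the same position in the last n_seasons cycles.
    """
    next_index = len(history)
    samples = []
    for k in range(1, n_seasons + 1):
        i = next_index - k * season_length
        if i >= 0:
            samples.append(history[i])
    if not samples:
        raise ValueError("need at least one full season of history")
    return sum(samples) / len(samples)

def cars_to_dispatch(predicted_orders, idle_cars):
    """How many extra cars to send ahead of time (never negative)."""
    return max(0, round(predicted_orders) - idle_cars)

# Toy data: 4 slots per "day", three days; the last slot of each day is the peak.
orders = [5, 6, 5, 20,  4, 7, 6, 24,  6, 5, 7, 22]
before_peak = orders[:-1]  # we are just before the third day's peak
predicted = forecast_next(before_peak, season_length=4)
print(predicted, cars_to_dispatch(predicted, idle_cars=15))  # 22.0 7
```

Running the forecast one slot ahead is what allows capacity to be moved 15 minutes or half an hour before the shortage appears.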

Predicting the passenger experience

After the trip ends we also need to predict whether the passenger's experience was good or bad. In historical orders some passengers complained, for example about poor service or a carpool detour, while others gave praise. From a large amount of historical data we learn which features lead to complaints and which lead to praise.

The most important piece is carpool pricing, which uses machine learning and optimization algorithms. The core idea is very simple: when a passenger places a carpool order, we predict how likely it is that a co-rider will be found between the starting point and the end point, and how well the two would match. If we predict that the passenger will most likely ride alone from beginning to end, the discount is smaller; otherwise it is larger.

In addition, we do a lot of work on images, for example detecting driver's license images and recognizing ID numbers, so that drivers can complete many procedures without going to an office.

We can think of Didi as a search engine in which passengers search for drivers. It differs from Baidu search: once a Baidu search ends, there are no follow-up issues. But after a passenger has found a driver on Didi, Didi still needs to ensure safety and the travel experience. So we recently introduced a machine learning system to predict drivers' service quality and attitude; measuring whether service is good or bad requires analyzing a large amount of passenger rating and comment data.

In the past Didi, like Uber, used a star rating system, and we later found that it did not work well. In reality users either do not rate at all or give a high score of four or five stars, so the star rating is not effective enough.

This is essentially a problem of user habits. To build a more comprehensive scoring system, we integrate all the traces a passenger leaves on the platform and then produce a single score. For example, a passenger may give stars while the text comment says the attitude was very poor; those are two dimensions. For detours, even if the passenger gives no information, we can judge from the trajectory and other data. We then produce a comprehensive score. The higher the score, the higher the driver's income, which pushes drivers to improve their quality of service.
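One way to picture such a comprehensive score is a weighted blend of whatever signals a trip left behind, skipping the ones that are missing. The three signals, their scaling and the weights below are all illustrative assumptions; the sentiment value is assumed to come from some separate text model.

```python
def service_score(stars=None, text_sentiment=None, detour_ratio=None):
    """Blend the available signals from one trip into a score in [0, 1].

    stars: 1 to 5, or None if the passenger did not rate.
    text_sentiment: -1 (very negative) to 1 (very positive), or None.
    detour_ratio: driven distance / planned distance, or None.
    Missing signals are skipped and the remaining weights renormalized.
    """
    signals = []
    if stars is not None:
        signals.append((0.3, (stars - 1) / 4))
    if text_sentiment is not None:
        signals.append((0.4, (text_sentiment + 1) / 2))
    if detour_ratio is not None:
        # No penalty up to the planned distance; the score hits 0 at 50% extra.
        extra = max(0.0, detour_ratio - 1.0)
        signals.append((0.3, max(0.0, 1.0 - extra / 0.5)))
    if not signals:
        return None
    total_weight = sum(w for w, _ in signals)
    return sum(w * s for w, s in signals) / total_weight

# Five stars, but the comment is very negative and the route had a 25% detour.
print(round(service_score(stars=5, text_sentiment=-1.0, detour_ratio=1.25), 2))  # 0.45
```

Because missing signals are skipped, a trip with no star rating can still be scored from the comment text and the trajectory.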

There is another problem: passengers writing bad reviews of drivers. For this we built a machine learning system that judges whether the responsibility lies with the driver or with the passenger. If the driver is not responsible, we do not lower the driver's score. After the system went online, driver satisfaction improved significantly.

Visualization system

Finally, we should mention the visualization system, which is very important. It lets us see what happened to historical orders and trips, such as which of the areas we are interested in have a high order completion rate. It also shows changes within a region, for example how order volume rises and falls around the morning and evening peaks, and how the response rate is very low at the peaks and may be very high at ordinary times. We can quickly learn the situation in each region at each time.

Beyond regions, we can also look at the city level: roughly how many orders does the city have? How many drivers? What share of passenger requests end in a completed order? We can grasp past and present conditions, and drivers can see real-time hot spots; guiding drivers toward hot spots reduces empty-driving time and improves the platform's efficiency.

In addition, we can look at cross-city travel over time, especially before the Spring Festival and other holidays, when many people carpool home. To this end we pick out some special areas and analyze separately what happened there.

The system also shows the supply and demand imbalance in a city at each moment: which areas are oversupplied, which have demand exceeding supply, which are balanced, and what happened in the past and now. In response to these phenomena, we need to find the reasons behind low response rates and low completion rates.
