The gestation of front-end intelligence

Although I first came into contact with recommendation algorithms in 2009, I did not set up my first artificial intelligence team until the end of 2014. Having worked on app distribution, advertising, information feeds, and the browser, and having seen the huge value that artificial intelligence brought to those businesses, I was thoroughly won over. From then on I firmly believed that artificial intelligence is the most powerful force since the computer and that it will change every corner of the world.

I have loved the front end ever since the days of the 32K modem and the "three web swordsmen" authoring tools. Looking toward the era of artificial intelligence, I conceived the idea of helping the front end enter the artificial intelligence field. In April 2018, as a joint effort within the Front End Committee, I proposed the "front-end intelligence" direction. From design and planning to actually landing this direction has taken more than a year. In practice I found that not understanding front-end intelligence, and not knowing how to do it, are the biggest obstacles that keep front-end engineers from taking part.

The lack of understanding comes mainly from the lack of a definition of front-end intelligence: people do not know what I am talking about or what is to be done. Many friends who have heard me speak have strongly asked me for learning suggestions and material. I have tried writing some documents, but always felt that I had not explained front-end intelligence clearly. It is therefore necessary to organize my practice and thinking into a summary for everyone.

Comparing artificial intelligence and front-end intelligence

"Front-end intelligence" is front-end technology under artificial intelligence. Just as machine learning is a subset of artificial intelligence, front-end intelligence is a way of achieving intelligence in the front-end field. Applying artificial intelligence in the front-end field is itself a step from theory to engineering technology. The step is a natural one, because theoretical research in artificial intelligence already relies on computer software, so academic results and technology convert into each other.

Driven by industry, the position of "algorithm engineer" emerged. By building an algorithm team from scratch and through a large amount of business practice, I experienced the particular nature of this position first-hand. Machine vision (image classification), speech (iFlytek), reinforcement learning (OpenAI, AlphaGo), natural language understanding (Google word vectors)... With major research breakthroughs in these directions, industry and academia are both actively pushing academic results into industrial use, and algorithm engineers shoulder the task of connecting the two.

In essence, however, academia and industry differ significantly. Apart from the dedicated research institutions inside enterprises, such as Microsoft Research and Google Research, an algorithm engineer working on a business spends most of the time using engineering means to put academic results into production. There are many papers and the academic atmosphere is strong, but few results can really generate value in a business. The academic research ability required in applied fields is therefore limited: what applications mostly choose is the minority of academic results that have been improved through long-term research and verified by industry. Front-end intelligence should therefore first pay attention to the academic results that are already mature in industry. Studying papers and reproducing their experiments serves only to judge the trend and lineage of a technology; the focus should be on building engineering capability around academic results and technology.

Front-end intelligence does not require learning too many underlying mathematical principles or algorithm details. Excellent academic results are basically the product of a great deal of research, calculation, practice, verification, and optimization; unless one is a genius, it is difficult to innovate at the level of underlying principles.

First, focus on defining the problem. What problem is artificial intelligence being asked to solve? (Can it be solved with artificial intelligence, and to what degree?) What are the key points and the origin of the problem after a "first principles" analysis? (Induce and abstract the problem into an application direction.) Which area of artificial intelligence do these key points belong to? (Which academic research area to search for reliable results.) Is there enough high-quality annotated data to describe the problem? (Supervised or unsupervised learning.)

Second, find a suitable model and a suitable training method, organize the data that describes the problem into the form the model requires, and train the model with that method.

Finally, connect the model to the existing front-end engineering pipeline, verify the model's service capability and effect in the business, and keep optimizing the model with the data generated in use.

Compared with artificial intelligence, "front-end intelligence" is the engineering technology and method for applying artificial intelligence in the front-end field. Reducing the engineering cost of applying artificial intelligence, improving engineering quality, and helping intelligence generate value: these are the core problems that front-end intelligence attends to.

Comparing the front end and front-end intelligence

Front-end intelligence is a complete new methodology that helps front-end technology upgrade into intelligent front-end technology. While promoting intelligence at Alibaba, I am often asked: since Python can systematically solve machine learning problems, why do it all over again with front-end technology? True, Python is the undisputed new champion, and traditional C/C++ and Matlab also have rich ecosystems in machine learning. But understanding and using artificial intelligence and machine learning inside those technical fields does little to move front-end technology into the "AI era". In the end, we still have to come back to the front-end field to understand and use artificial intelligence. On the one hand, this lets us think about the value of artificial intelligence from the front end's own position; on the other hand, artificial intelligence can help the front-end engineering system and the front-end technology system grow and develop. Compared with adopting the complex technical and engineering system around Python, it is more practical to carry out intelligent transformation on the basis of the front-end technical system.

Node.js is a useful comparison. It too was hugely controversial: people argued about what Node.js meant for the front end and whether it had any value. In fact, Node.js, as the extension of JavaScript technology to the server side, greatly enriched the ecosystem of front-end technology and engineering and solved the problem of "whoever understands best should define". In the past, front-end engineers understood UI interaction best, yet could not define the business configuration and data behind that interaction. At first, in the scenarios where template technology was applied, this problem did not arise, nor did it cause much trouble in scenarios spanning the front and back ends. Later, as the demand for dynamic capability rose and system complexity grew, the cost of communication and collaboration between front end and back end under the traditional approach raised R&D costs and reduced R&D efficiency and availability.

Node.js gave front-end engineers the ability to control business logic and code logic across technical fields and platforms. The Node ecosystem boomed and ushered in a golden age of front-end engineering efficiency. "Front-end intelligence" likewise builds on the Node.js field to solve the problem of applying "intelligent" technology: as the front-line technology that best understands business scenes and users, it engineers and integrates artificial intelligence and brings its value to users on the front line. The latter should be a new species that grows in the ecological soil of the former. Within front-end technology, front-end intelligence can enrich technical perspectives, ideas, and means. Beyond front-end technology, it can open up more application scenarios, give the business more value, and narrow the gap with other software technologies.

The problems front-end intelligence faces

Looking at a technical field is like looking at the development of a business. Recalling how WeChat grew, Ma Huateng said that rather than be killed off by others, it is better to overturn yourself. Any technology is like a company. Whether it is JavaScript as a programming language or the front end as a technical field, its vitality comes from the practitioners in the field, who keep reforming and keep innovating.

Because I worked on the browser, I felt the industry's changes early, like the duck that is first to know when the river warms: the chill that the mobile era brought to front-end technology, with front-end technology retreating wholesale into middle- and back-office systems and its application scenarios shrinking. Front-end technology innovated and fought back: first RN and Weex, then PWA and high-level Web APIs with their optimization of Web container performance, TypeScript with strong typing and the ability to organize the code of large projects, mini programs with private-domain traffic, and Node.js with its upgrade of service and engineering capabilities. The shrinking of scenarios caused by mobile apps was eased by the development of front-end technology itself. As the saying goes, repair the roof while the sky is clear: what are the trends and threads of future technology development, and what impact and opportunities will they bring to front-end technology?

Old: User – Front End Technology / Client Technology – Service

New: User – Front End Technology / Client Technology – Edge Computing Framework – Model – Service

In the old way, front-end technology is the bridge from users to services, and the means of connection are the UI, interaction logic, business logic, and code logic. In the new way, with edge computing frameworks such as Alinn, NCNN, XNN, and TensorFlow Lite, models are given the ability to recognize users and recognize scenes. Today, with toolkits like ML Kit as the carrier, capabilities such as face recognition (face-scan authentication), body recognition (dance battles), speech recognition (voice interaction), speech synthesis (spoken number announcements), and image recognition (machine vision) are built into the client, and clients are already using these model capabilities to change how software is developed. As artificial intelligence continues to develop, the question is no longer merely whether to call this API or that one. It is the technical problem of how to bring artificial intelligence capabilities into UI interaction and serve users in new ways.

Combining with artificial intelligence is a trend. When a device can authenticate directly with the camera and a face recognition model, the traditional login process will gradually be abandoned, just as today's login has increasingly become a mobile phone number plus an SMS verification code. The UI logic of the past, "open the home page", will turn into "user XX opens the home page in scene XX", and the model may also supply the operation paths the user is used to and the ways the user prefers to obtain services. The earlier coarse approach of loading a template, pulling data, fetching styles, and rendering the page clearly cannot meet these requirements. Every step will acquire many conditions: load the template for young white-collar women, fetch XX resources according to scene XX, fetch XX styles according to scene XX, and so on. This series of additional conditions lets the UI understand users better and provide more personalized services. It is by no means a matter of switching to a different API; it involves upgrading the entire technical system and engineering system.
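The move from unconditional page assembly to scene-conditioned page assembly can be sketched in TypeScript. This is only an illustration under stated assumptions: the `Scene` fields, the segment and place names, and the lookup tables are all hypothetical, and in a real system the scene would come from a model's prediction rather than being passed in by hand.

```typescript
// Hypothetical description of who is opening the page and where.
interface Scene {
  userSegment: string; // e.g. a segment label predicted by a model
  place: string;       // e.g. "commute" or "office"
}

interface PagePlan {
  template: string;
  resources: string[];
  style: string;
}

// The old, coarse way: every user gets the same plan.
const DEFAULT_PLAN: PagePlan = {
  template: "home-default",
  resources: ["banner-default"],
  style: "theme-default",
};

// Hypothetical lookup tables; a real system would likely query services.
const TEMPLATE_BY_SEGMENT: { [segment: string]: string } = {
  "young-white-collar-women": "home-young-white-collar-women",
};
const RESOURCES_BY_PLACE: { [place: string]: string[] } = {
  commute: ["banner-short-video", "coupon-nearby"],
};
const STYLE_BY_PLACE: { [place: string]: string } = {
  commute: "theme-one-handed",
};

// Look up an own property of the table, falling back when it is absent.
function lookup<T>(table: { [key: string]: T }, key: string, fallback: T): T {
  return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : fallback;
}

// The new way: each step (template, resources, style) depends on the scene.
function planPage(scene?: Scene): PagePlan {
  if (!scene) return DEFAULT_PLAN;
  return {
    template: lookup(TEMPLATE_BY_SEGMENT, scene.userSegment, DEFAULT_PLAN.template),
    resources: lookup(RESOURCES_BY_PLACE, scene.place, DEFAULT_PLAN.resources),
    style: lookup(STYLE_BY_PLACE, scene.place, DEFAULT_PLAN.style),
  };
}
```

Calling `planPage()` with no scene reproduces the old behavior, while `planPage({ userSegment: "young-white-collar-women", place: "commute" })` selects a template, resources, and style for that scene. Even in this toy form, every rendering step gains a condition, which is why the change reaches the whole engineering system rather than a single API.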

In summary, we can focus on three problems: the complexity of mobile scenes, the incompatibility between old and new UI and interaction technology systems, and the rise in R&D cost brought by personalized demand. Each of them shows concretely how the technological innovation brought by artificial intelligence will change day-to-day front-end work.

Mobile scene complexity is too high

In the PC era, home and office were the main scenes. Hardware computing power was limited and software technology lagged behind, so the upgrade from a character interface to a graphical interface was enough to cause a sensation across the industry and even the world. As hardware and software developed, Internet technology was gradually incorporated into hardware and software products, beyond traditional production and entertainment. Mobile devices essentially put the PC and the Internet into the user's pocket: people stay connected to the Internet at all times and, thanks to continuously innovating hardware, can use powerful storage and computing capacity anytime, anywhere.

In terms of space, this change completely broke through home and office as the main scenes. Unless it is explicitly prohibited, any place may be a scene in which the user uses a mobile device. In terms of time, portable mobile devices occupy almost all of the user's attention from early morning on: people at a gathering each playing with their own phones, the "heads-down tribe" visible everywhere on the road... Even when the battery cannot last a whole day of heavy use, shared power banks can be rented with a scan almost anywhere to recharge it. When a technology can be used anytime and anywhere, the scenes that technology must be designed for are bound to be very complicated.

Besides time and space, a scene is made up of many other elements, each with many possible values. Recall the basic elements of narrative writing taught at school: time, place, characters, plot, and environment. A scene can be understood as a person interacting with an environment at a particular time and place. Besides time and space, the participants' characteristics and the environment are complex and change dynamically, and the different ways in which different participants interact with different environments are also complex and dynamic. Suppose a user passes a restaurant at lunchtime and receives a promotional reminder that a coupon can be used there. The user enters the restaurant, sits down, scans a code to order, and scans a code to pay after the meal. In this scene, the user's walking route, spending, sensitivity to discounts, changes in taste, available funds, and so on turn a seemingly simple meal into something more complex, with many variants.

From the perspective of conventional technology, the scene above can be implemented with time, geographic location, push notifications, campaigns, QR codes, electronic payment, and so on. From the intelligent perspective, the question is how to use the real-time sensor data on the phone in the current scene, how to obtain offline, non-real-time data from the server, and how to use artificial intelligence to understand these data and serve the current scene, giving users more personalized service and a better experience.

Sensors on mobile devices include image sensors, light sensors, inertial accelerometers, magnetic field sensors, four-axis gyroscope angle sensors, audio sensors, touch sensors, structured-light depth sensors, 3D vision sensors, and so on. Data from these sensors are delivered to models in real time, and different models produce specific functions from the data of a single sensor or a combination of sensors. For example, image sensors, structured-light depth sensors, and 3D vision sensors, together with a model, produce a face recognition function: the model predicts whether the current sensor input matches the feature data stored locally in encrypted form, which is the basis of face recognition authentication. As another example, take the inertial accelerometer and the four-axis gyroscope angle sensor together. By training on the inertia and angle sensor data recorded while running, a model learns the characteristics and patterns of those data. Afterwards the model can judge from real-time inertia and angle data whether the user is running, and people who strap their phones to a washing machine will no longer be able to pass that off as a run.

There are many more examples. As sensors multiply and their data become more accurate and more real-time, these permutations and combinations will gradually become an extension of human senses, just as exoskeletons extend the abilities of human limbs.
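The running example can be made concrete with a deliberately small TypeScript sketch. It is a toy stand-in for a trained model, not the method any real product uses: it learns one average feature vector per label from labeled accelerometer windows and classifies a new window by the nearest average. The `Sample` shape, the two features, and all function names are assumptions made for illustration, and features this simple would not by themselves tell a washing machine from a runner.

```typescript
// One reading from a hypothetical accelerometer stream, in m/s^2.
interface Sample { x: number; y: number; z: number; }
type Label = "running" | "not_running";
// [mean, standard deviation] of the acceleration magnitude over a window.
type Features = [number, number];
interface Centroid { label: Label; features: Features; }

function extractFeatures(window: Sample[]): Features {
  if (window.length === 0) throw new Error("empty sensor window");
  const mags = window.map((s) => Math.sqrt(s.x * s.x + s.y * s.y + s.z * s.z));
  const mean = mags.reduce((a, m) => a + m, 0) / mags.length;
  const variance = mags.reduce((a, m) => a + (m - mean) * (m - mean), 0) / mags.length;
  return [mean, Math.sqrt(variance)];
}

// "Training": average the features of the labeled windows for each label.
function train(examples: { window: Sample[]; label: Label }[]): Centroid[] {
  const labels: Label[] = ["running", "not_running"];
  const centroids: Centroid[] = [];
  for (const label of labels) {
    const feats = examples
      .filter((e) => e.label === label)
      .map((e) => extractFeatures(e.window));
    if (feats.length === 0) continue; // no examples for this label
    const mean = feats.reduce((a, f) => a + f[0], 0) / feats.length;
    const sd = feats.reduce((a, f) => a + f[1], 0) / feats.length;
    centroids.push({ label, features: [mean, sd] });
  }
  return centroids;
}

// Prediction: pick the label whose centroid is closest to the new window.
function classify(centroids: Centroid[], window: Sample[]): Label {
  if (centroids.length === 0) throw new Error("model has no centroids");
  const f = extractFeatures(window);
  let best = centroids[0];
  let bestDist = Infinity;
  for (const c of centroids) {
    const dm = f[0] - c.features[0];
    const ds = f[1] - c.features[1];
    const dist = Math.sqrt(dm * dm + ds * ds);
    if (dist < bestDist) { bestDist = dist; best = c; }
  }
  return best.label;
}
```

The point of the sketch is the shape of the pipeline: raw sensor windows go in, a learned summary sits in the middle, and the front end receives a judgment ("running" or not) instead of raw numbers.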

The model translates raw sensor data into valuable understanding and judgment, much like following faint clues to a bug or a performance optimization. Just as ten thousand people may see ten thousand different problems, different models will form different understandings and judgments from the same data. An adult; an adult man; an adult man working in IT... With abstractions of different depths, and with other angles of abstraction such as "a wealthy and impulsive adult man", models will produce richer and more complex understanding and cognition. Starting from the participant's side, this understanding and cognition make the scene more complicated still.

Interaction methods are incompatible

Besides scenes and participants, what pushes complexity to the extreme is interaction behavior, which is the largest part of the future technical challenge. The main interaction devices of the PC era were the keyboard and the mouse. The keyboard, with its fixed keys, brings the participant's physical keystrokes into the digital world; the mouse brings the participant's physical movements into the digital world. Simple, clear, direct, and unified input methods greatly reduced the difficulty of designing and implementing UI and interaction. Mobile devices carried over the input habits of both the keyboard and the mouse, but they also add input based on face recognition, input based on body recognition, gesture input based on motion recognition, and more. These complex input methods require rethinking the technical implementation across scenes, users, and interaction methods.
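One way to see why recognition-based input needs rethinking: a key press is certain, whereas a recognized gesture arrives as a prediction that may be wrong. The TypeScript sketch below shows that difference under stated assumptions: the `Prediction` shape, the gesture labels, the bindings, and the 0.8 threshold are all hypothetical and not taken from any particular framework.

```typescript
// A model's output for one frame: a label plus a confidence in [0, 1].
interface Prediction { label: string; confidence: number; }
type Action = "confirm" | "cancel";

// Hypothetical bindings from recognized gestures to UI actions.
const GESTURE_BINDINGS: { label: string; action: Action }[] = [
  { label: "nod", action: "confirm" },
  { label: "shake_head", action: "cancel" },
];

// Unlike a click handler, the handler must decide whether to trust the input.
// Returns null when the prediction is too uncertain or has no binding.
function interpret(prediction: Prediction, threshold: number = 0.8): Action | null {
  if (prediction.confidence < threshold) return null;
  for (const binding of GESTURE_BINDINGS) {
    if (binding.label === prediction.label) return binding.action;
  }
  return null;
}
```

With these bindings, `interpret({ label: "nod", confidence: 0.95 })` yields `"confirm"`, while the same label at confidence 0.5 yields `null`, so the UI has to decide what to do with an input it does not trust, for instance by asking again. A keyboard or mouse event never raises that question.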

Alibaba's Double Eleven had a smile-matching activity which, with the help of Alinn's powerful on-device machine learning acceleration, detected the user's expression in real time and matched it against emoji. This application of a face recognition model leaves a great deal of room for imagination. Beyond this small game, expressions can also be abstracted into a means of interaction: for example, a frown means cancel and a smile means confirm. Handling such interaction is not as simple as calling the BOM or an API. The program has to understand the prediction results given by the model and respond correctly to different predictions, rather than treating interaction as simply as the ontype and onclick of the past. Current front-end frameworks and technologies have not yet worked this change into their technical solutions.

Personalization demand increases R&D costs

Every aspect of society is a balance between cost and efficiency. For everyday travel, driving yourself is undoubtedly the most efficient. Hailing a ride may cost less, since there is no car to buy and no parking fee to pay, but the ride has to be arranged ahead of time, so it is less efficient than owning a private car. The subway and the bus follow the same logic, giving up efficiency while reducing cost. The way to reduce cost here is to go "public": the more "personalized" the service, the higher the efficiency and the higher the cost behind it; the more "public" the service, the lower the efficiency and the lower the cost behind it.

From a business perspective, once traffic in the computer software and Internet industries reaches a certain level, the demographic dividend and the dividend of traffic growth gradually disappear, and enterprises focus more on efficiency: how much value can be generated each time a user comes into contact with the product or service? A news app that shows uninteresting content every time it is opened, or an e-commerce app that shows disliked products every time it is opened, cannot produce user value. Today's technical capability resembles "public transport": tag the users, tag the content, then apply machine learning, collaborative recommendation, and other algorithms. But this only ever raises efficiency to the level of "a group of people", because the cost of serving each individual user is too high.

Solution ideas

In the past: responding to complex scenes with dynamic capability

In the era of the "three web swordsmen", pages were basically static, and the "D" in "DHTML" referred only to the dynamics of user interaction. To escape the high R&D cost, dynamic technology emerged, and development cost was greatly reduced by means of template technology and dynamic data filling. As more experts from traditional desktop applications and software engineering joined the front end, page-by-page development eventually turned into something close to application development: controls and components are used to present data and assemble pages, and single-page applications pushed dynamic technology to its peak. The data and style changes that come with changing requirements are decoupled dynamically, so product needs can be met with little or no code modification and few or no releases.
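Template technology with dynamic data filling can be illustrated with a few lines of TypeScript. This is a minimal sketch, not the design of any particular template engine: the `{{ name }}` placeholder syntax and the `renderTemplate` helper are assumptions chosen for the example, and the sketch does no HTML escaping.

```typescript
type TemplateData = { [key: string]: string | number };

// Replace each {{ key }} placeholder with the matching value from `data`.
// Placeholders with no matching key are left untouched.
function renderTemplate(template: string, data: TemplateData): string {
  return template.replace(
    /\{\{\s*(\w+)\s*\}\}/g,
    (placeholder: string, key: string) =>
      Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : placeholder
  );
}

// One template, many data records: the page changes without changing code.
const itemTemplate = "<li>{{ title }}: {{price}} yuan</li>";
const html = [
  { title: "Tea", price: 12 },
  { title: "Rice", price: 8 },
]
  .map((item) => renderTemplate(itemTemplate, item))
  .join("");
```

Here the markup is written once and the data vary, which is the decoupling the paragraph describes: a change in data or wording needs a new record or a new template string, not a new page.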

The complexity of programming never disappears; it is only transferred. In dynamic technology, the stronger the dynamic capability, the more flexible the architecture and programs that must be designed and implemented, and the more flexible they are, the more complex the development process becomes. Turn the arrow the other way: with little design, each page is implemented for each requirement. Such an implementation is almost completely inflexible from the start and can only be rewritten when requirements change. Returning to the response to complex scenes: whether the flexibility can match the complexity of the scenes, and whether requirement changes and new requirements can be implemented well, is a matter of balancing complexity and flexibility. Even continually increasing flexibility is not enough, because new scenes do not appear gradually. Many scenes, such as human body recognition, burst out suddenly and catch the entire technical system off guard.

The depth of understanding of requirements determines how accurately future changes can be predicted, and without accurate prediction of requirement changes it is impossible to judge whether today's technical reserves can cope with those changes. In practice, technical reserves include prepared frameworks, scaffolding, libraries, templates, themes, components, controls, and so on, any of which may fail. Take VR as an example. Even with accumulated 3D experience, without a deep understanding of the VR scene it is hard to use existing technical reserves to implement a VR scene well. Playing video or browsing the web on a Vive merely forces a 2D interface into 3D space, which hardly serves the scene. Serving it well requires a great deal of research, practice, and thinking on problems such as occlusion in 3D space, combining virtual objects with the UI, and interaction within 3D scenes. Whenever such scenes are attempted with the current technical system, a large number of places that need to be rebuilt are exposed.

Continuing with the VR example: is it possible to extend existing technology to overcome the technical problems caused by scene complexity? First review the current situation. Even with large notches, small notches, "big eyes", "big chins", and many other irregularly shaped screens, the UI can in general be summarized as a 2D polygon. In a VR scene, the UI may take any shape: it may be a 3D sand table, it may be embedded inside a transparent material, or it may float in the air. The way to respond to scenes this complex is to use front-end intelligence for a thorough refactoring and renovation.

Now: using the model's perception to simplify scene complexity

Scenes that today are only tested on a small scale in operations and marketing campaigns will become infrastructure in the future. The mobile Internet truly put the Internet in users' pockets and kept everyone connected at all times, but while tariffs were high it was difficult for it to change the industry. Unlimited data, the high network speeds of 3G and 4G, and acceptable plan prices together determined whether the mobile Internet could become infrastructure. Today, a red envelope campaign of this kind is only a marketing activity. On-device artificial intelligence frameworks are not yet mature, model capabilities and software capabilities are not connected closely enough, and the lack of accumulated technical experience keeps R&D costs high. As hardware and software keep improving and the understanding of artificial intelligence keeps deepening, artificial intelligence and software capabilities will be combined more closely and more robustly, and R&D costs will gradually fall. Eventually, face and expression recognition like that of the Double Eleven campaign will become infrastructure for building business scenes. As the infrastructure improves and develops, the scenes involved will also become more complex, and the business will gradually squeeze out every drop of the technology's potential.