The next major revolution in computer chip technology is now a step closer to reality. Researchers have shown that carbon nanotube transistors can be made rapidly in commercial facilities, with the same equipment used to manufacture traditional silicon-based transistors – the backbone of today's computing industry.

Carbon nanotube field-effect transistors (CNFETs) are more energy-efficient than silicon field-effect transistors and could be used to build a new generation of three-dimensional microprocessors. But until now, these devices have been mostly restricted to academic laboratories, with only small numbers produced.

In a new study published this month in the journal Nature Electronics, however, scientists have demonstrated how CNFETs can be fabricated in large quantities on 200-millimetre wafers: the industry standard for computer chip design. The CNFETs were created in a commercial silicon manufacturing facility and a semiconductor foundry in the United States.

Having analysed the deposition technique used to make the CNFETs, a team at the Massachusetts Institute of Technology (MIT) developed a way of speeding up the fabrication process by more than 1,100 times compared to previous methods, while also reducing the cost. Their technique deposited the carbon nanotubes edge to edge on wafers, with CNFET arrays of 14,400 by 14,400 distributed across multiple wafers.

Max Shulaker, an MIT assistant professor of electrical engineering and computer science who has been designing CNFETs since his PhD days, says the new study represents "a giant step forward, to make that leap into production-level facilities." Bridging the gap between lab and industry is something that researchers "don't often get a chance to do," he added. "But it's an important litmus test for emerging technologies."

For decades, improvements in silicon-based transistor manufacturing have brought down prices and increased energy efficiency in computing. Concerns are mounting that this trend may be nearing its end, however, as packing ever more transistors into integrated circuits no longer appears to improve energy efficiency at historic rates.

CNFETs are an attractive alternative technology because they are "around an order of magnitude more energy efficient" than silicon-based transistors, says Shulaker. While silicon-based transistors are typically made at temperatures of 450 to 500 degrees Celsius, CNFETs can be manufactured at near-room temperatures. "This means that you can actually build layers of circuits right on top of previously fabricated layers of circuits, to create a 3D chip," Shulaker explains. "You can't do this with silicon-based technology, because it would melt the layers underneath." A 3D computer chip, which might combine logic and memory functions, is projected to "beat the performance of a state-of-the-art 2D chip made from silicon by orders of magnitude," he says.

One of the most effective ways to build CNFETs in the lab is a method for depositing nanotubes called incubation, in which a wafer is submerged in a bath of nanotubes until the nanotubes stick to the wafer's surface. The performance of the CNFET depends in large part on the deposition process, explains co-author Mindy Bishop, a PhD student in the Harvard-MIT Health Sciences and Technology program. This affects both the number of carbon nanotubes on the surface of the wafer and their orientation.
They are "either stuck onto the wafer in random orientations like cooked spaghetti, or all aligned in the same direction like uncooked spaghetti still in the package." Aligning the nanotubes perfectly in a CNFET leads to ideal performance, but alignment is difficult to obtain, says Bishop: "It's really hard to lay down billions of tiny 1-nanometre diameter nanotubes in a perfect orientation across a large 200-millimetre wafer. To put these length scales into context, it's like trying to cover the entire state of New Hampshire in perfectly oriented, dry spaghetti." While the incubation method employed by the MIT team is unable to perfectly align every nanotube (perhaps a breakthrough in future years may achieve this?), their experiments showed that it delivers sufficiently high performance for a CNFET to outperform a traditional silicon-based transistor. Furthermore, careful observations revealed how to alter the process to make it more viable for large-scale commercial production. For instance, Bishop's team found that "dry cycling", a method of intermittently drying out the submerged wafer, could drastically reduce the incubation time – from 48 hours to 150 seconds. Another new method called artificial concentration through evaporation (ACE) deposited small amounts of nanotube solution on a wafer, instead of submerging the wafer in a tank. The slow evaporation of the solution increased the overall density of nanotubes on the wafer. The researchers worked with Analog Devices, a commercial silicon manufacturing facility, and SkyWater Technology, a semiconductor foundry, to fabricate CNFETs using the improved methods. They were able to use the same equipment that the two facilities use to make silicon-based wafers, while also ensuring that the nanotube solutions met strict chemical and contaminant requirements of the facilities. The next steps, already underway, will be to build different types of integrated circuits out of CNFETs in an industrial setting and explore some of the new functions that a 3D chip could offer, adds Shulaker. "The next goal is for this to transition from being academically interesting to something that will be used by folks, and I think this is a very important step in this direction," he concludes.
Facebook AI has built and open-sourced BlenderBot, the largest-ever open-domain chatbot. It outperforms others in terms of engagement and also feels more human, according to human evaluators. The culmination of years of research in conversational AI, this is the first chatbot to blend a diverse set of conversational skills — including empathy, knowledge, and personality — together in one system. We achieved this milestone through a new chatbot recipe that includes improved decoding techniques, novel blending of skills, and a model with 9.4 billion parameters, which is 3.6x more than the largest existing system. Today we're releasing the complete model, code, and evaluation set-up, so that other AI researchers will be able to reproduce this work and continue to advance conversational AI research.

Conversation is an art that we practice every day — when we're debating food options, deciding the best movie to watch after dinner, or just discussing current events to broaden our worldview. For decades, AI researchers have been working on building an AI system that can converse as well as humans can: asking and answering a wide range of questions, displaying knowledge, and being empathetic, personable, engaging, serious, or fun, as circumstances dictate. So far, systems have excelled primarily at specialized, preprogrammed tasks, like booking a flight. But truly intelligent, human-level AI systems must effortlessly understand the broader context of the conversation and how specific topics relate to each other.

As the culmination of years of our research, we're announcing that we've built and open-sourced BlenderBot, the largest-ever open-domain chatbot. It outperforms others in terms of engagement and also feels more human, according to human evaluators. This is the first time a chatbot has learned to blend several conversational skills — including the ability to assume a persona, discuss nearly any topic, and show empathy — in natural, 14-turn conversation flows. Today we're sharing new details of the key ingredients that we used to create our new chatbot.

Some of the best current systems have made progress by training high-capacity neural models with millions or billions of parameters using huge text corpora sourced from the web. Our new recipe incorporates not just large-scale neural models, with up to 9.4 billion parameters — or 3.6x more than the largest existing system — but also equally important techniques for blending skills and detailed generation.

Chatbot recipe: Scale, blending skills, and generation strategies

Scale

As is common in natural language processing research today, the first step in creating our chatbot was large-scale training. We pretrained large Transformer neural networks, with up to 9.4 billion parameters, on large amounts of conversational data. We used previously available public domain conversations, comprising 1.5 billion training examples of extracted conversations. Our neural networks are too large to fit on a single device, so we used techniques such as column-wise model parallelism, which allows us to split the neural network into smaller, more manageable pieces while maintaining maximum efficiency. Such careful organization of our neural networks enabled us to handle larger networks than we could previously while maintaining the high efficiency needed to scale to terabyte-size data sets.
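To illustrate the idea behind column-wise model parallelism, here is a minimal, self-contained sketch (a toy example, not Facebook's training code): a layer's weight matrix is split by columns so that each device computes only a slice of the output, and concatenating the slices reproduces the unsharded result. All the sizes below are made up for illustration.

    import torch

    # Toy illustration of column-wise model parallelism: each "device" owns a
    # column slice of the weight matrix and produces a slice of the output.
    d_model, d_ff, n_devices = 8, 16, 2       # tiny, illustrative sizes
    x = torch.randn(4, d_model)               # a batch of activations
    full_weight = torch.randn(d_model, d_ff)  # the layer we want to split

    # Split the weight matrix by columns; each shard would live on its own GPU.
    shards = torch.chunk(full_weight, n_devices, dim=1)

    # Every device sees the same input x and computes its slice of the output.
    partial_outputs = [x @ w_shard for w_shard in shards]

    # Concatenating the slices reproduces the unsharded computation.
    y_parallel = torch.cat(partial_outputs, dim=1)
    y_reference = x @ full_weight
    assert torch.allclose(y_parallel, y_reference, atol=1e-6)

In practice each shard sits on a different device and the concatenation is a cross-device gather; everything stays on one device here purely to show the algebra.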
Blending skills

While learning at scale is important, it's not the only ingredient necessary for creating the best possible conversationalist. Learning to mimic the average conversation in large-scale public training sets doesn't necessarily mean that the agent will learn the traits of the best conversationalists. In fact, if not done carefully, it can make the model imitate poor or even toxic behavior.

We recently introduced a novel task called Blended Skill Talk (BST) for training and evaluating these desirable skills. BST consists of the following skills, leveraging our previous research:

- Engaging use of personality (PersonaChat)
- Engaging use of knowledge (Wizard of Wikipedia)
- Display of empathy (Empathetic Dialogues)
- Ability to blend all three seamlessly (BST)

Blending these skills is a difficult challenge because systems must be able to switch between different tasks when appropriate, like adjusting tone if a person changes from joking to serious. Our new BST data set provides a way to build systems that blend and exhibit these behaviors. We found that fine-tuning the model with BST has a dramatic effect on human evaluations of the bot's conversational ability.

Generation strategies

Training neural models is typically done by minimizing perplexity, which measures how well models can predict and generate the next word. However, to make sure conversational agents don't repeat themselves or display other shortcomings, researchers typically apply a number of generation strategies after the model is trained, including beam search, next-token sampling, and n-gram blocking. We find that the length of the agent's utterances is important in achieving better results with human evaluators. If they're too short, the responses are dull and communicate a lack of interest; if they're too long, the chatbot seems to waffle and not listen. Contrary to recent research, which finds that sampling outperforms beam search, we show that a careful choice of search hyperparameters can give strong results by controlling this trade-off. In particular, tuning the minimum beam length gives important control over the "dull versus spicy" spectrum of responses.
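To make these decoding controls concrete, here is a minimal sketch using the Hugging Face Transformers library with a publicly available BlenderBot checkpoint. The checkpoint name and the hyperparameter values are illustrative choices, not the settings used in the paper, and the authors' own release is through ParlAI rather than this library.

    from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration

    # Illustrative checkpoint; the paper's models were released through ParlAI.
    name = "facebook/blenderbot-400M-distill"
    tokenizer = BlenderbotTokenizer.from_pretrained(name)
    model = BlenderbotForConditionalGeneration.from_pretrained(name)

    inputs = tokenizer("Hello, how are you today?", return_tensors="pt")
    reply_ids = model.generate(
        **inputs,
        num_beams=10,            # beam search rather than sampling
        min_length=20,           # minimum beam length: pushes replies toward longer, more detailed responses
        no_repeat_ngram_size=3,  # n-gram blocking to reduce repetition
        max_length=128,
    )
    print(tokenizer.decode(reply_ids[0], skip_special_tokens=True))

Lowering or removing min_length shortens the responses, which is exactly the effect quantified in the evaluation results below.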
Putting our recipe to the test

To evaluate our model, we benchmarked its performance against Google's latest Meena chatbot through pairwise human evaluations. Since their model has not been released, we used the roughly 100 publicly released and randomized logs for this evaluation. Using the ACUTE-Eval method, human evaluators were shown a series of dialogues between humans paired with each respective chatbot and asked:

- "Who would you prefer to talk to for a long conversation?" (measuring engagingness)
- "Which speaker sounds more human?" (measuring humanness)

When presented with chats showing Meena in action and chats showing BlenderBot in action, 67 percent of the evaluators said that our model sounds more human, and 75 percent said that they would rather have a long conversation with BlenderBot than with Meena.

Further analysis via human evaluation underscored the importance of both blending skills and choosing a generation strategy that produces nonrepetitive, detailed responses. In an A/B comparison between human-to-human and human-to-BlenderBot conversations to measure engagement, models fine-tuned with BST tasks were preferred 49 percent of the time over humans, while models trained only on public domain conversations were preferred just 36 percent of the time. Decoding strategies, such as beam blocking and controlling for the minimum beam length, also had a large impact on results. After we removed the minimum beam length constraint, the model's responses were roughly half the length, and the performance of our BST models went down from 49 percent to 21 percent. These results show that while scaling models is important, there are other, equally important parts of the chatbot recipe.

Measuring how often human evaluators preferred our chatbots to human-to-human chats, we have improved model performance in this evaluation from 23% in 2018 to 49% today. Over the past few years, we've doubled the performance of our chatbot models through key improvements such as Specificity Control, Poly-Encoders, and the recipe described in this blog post. Our latest model's performance is nearly equal to human-level quality in this specific test setup. However, our chatbot still has many weaknesses relative to humans, and finding an evaluation method that better exposes these weaknesses is an open problem and part of our future research agenda.

Looking ahead

We're excited about the progress we've made in improving open-domain chatbots. However, we are still far from achieving human-level intelligence in dialogue systems. Though it's rare, our best models still make mistakes, like contradiction or repetition, and can "hallucinate" knowledge, as is seen in other generative systems. Human evaluations are also generally conducted using relatively brief conversations; we would most likely find that sufficiently long conversations make these issues more apparent. We're currently exploring ways to further improve the conversational quality of our models in longer conversations with new architectures and different loss functions. We're also focused on building stronger classifiers to filter out harmful language in dialogues. And we've seen preliminary success in studies to help mitigate gender bias in chatbots.

True progress in the field depends on reproducibility: the opportunity to build upon the best technology possible. We believe that releasing models is essential to enable full, reliable insights into their capabilities. That's why we've made our state-of-the-art open-domain chatbot publicly available through our dialogue research platform, ParlAI. By open-sourcing code for fine-tuning and conducting automatic and human evaluations, we hope that the AI research community can build on this work and collectively push conversational AI forward.

Written by:
Stephen Roller, Research Engineer
Jason Weston, Research Scientist
Emily Dinan, Research Engineer
Maxime Firth's business is complicated to manage, even in good times. His company, Onduline, turns recycled fibres into roofing material, dousing them with bitumen to make them waterproof, and sells products in 100 countries. Its eight production plants range from Nizhny Novgorod in Russia and Penang in Malaysia to Juiz de Fora in Brazil.

Further complicating his supply chain, Mr Firth's business is strongly seasonal. People install roofs in the summer, so products are made from January to March, to sell from April to September. The big question for him is how much demand there will be from important markets like China and the US. "Instead of manufacturing something that you are forced to sell, it is better to know what the market wants to buy," he says.

The impact of coronavirus makes it difficult for businessmen like him to make the right decisions. To manage demand, Mr Firth's company used to work with "homemade" IT tools, mainly based on Excel spreadsheets. But now he is using software accessible over the internet (also known as cloud-based), which can model his situation every week. It allows the firm to use the latest data to explore how demand might start returning in different markets. "In terms of profitability, and also production, it's changing every week," he says.

Coronavirus "puts supply chain planning under the spotlight," says Frank Calderoni, chief executive of Anaplan, whose software Onduline has been using. Some companies have seen sales dry up: purchases of Mr Firth's roofing material, he says, are down 70%. But demand for some goods has rocketed, including groceries, books, coffee, and toys for children.

Supply chain chaos could last at least another 18 months, and probably longer, says Len Pannett, president of the UK roundtable of the Council of Supply Chain Management Professionals. Businesses trying to get back to work may find their overseas suppliers are still in lockdown. The more information available about every firm in the chain, the better. "Being in touch with a customer's customer's customer, you can see ahead of time what's coming your way" and start finding alternative suppliers if you need to, Mr Pannett says.

Most businesses had been monitoring supply chains, finance, and sales with different tools. Joining these silos into one cloud platform lets finance teams peek into supply chains and sales, and be more efficient with money, he says. And with margins tighter than ever, businesses will need to make better decisions. More accurate real-time information will help them do this, and keep better track of their decisions' effects, according to Mr Calderoni. Supply chains were already trickling onto the cloud, and he says coronavirus will accelerate that move, with technologies like blockchain and artificial intelligence (AI) becoming commonplace.

For the Gulf state of Bahrain - an island - all its ventilators, facemasks, medicines, and 99.5% of the goods in its market come through its only port. The outbreak forced the port to change its procedures, says Susan Hunter, who as head of APM Terminals Bahrain is in charge of Khalifa Bin Salman Port's day-to-day running. The port had to quickly arrange for lorry drivers to apply for gate passes, do security checks, and make payments online.
It has also set up a critical cargo programme to identify containers carrying medical supplies, allowing them to pass swiftly through customs and be placed where they can be accessed quickly. Ms Hunter would like to move all the administration to a blockchain system. "There's no resistance, just 'How are we going to make that happen?'" she says. "We're just a couple of steps away from being able to put a lot of our documents onto a blockchain platform; we are seeing the industry changing that way."

Blockchain keeps a record of transactions in a ledger, stored across a number of computers linked in a peer-to-peer network. This lets firms share information about a container just once, while everyone up and down the chain can see that information. It allows "the right person to have the right information at the right time, in a permissioned way," says Richard Stockley, blockchain executive at IBM Europe.

Blockchain has made headway in areas like tracing food through a supply chain. Walmart asked IBM to create a food tracing system based on blockchain technology. As an experiment, Walmart's chief executive pulled out a packet of mangoes, imagined they were toxic, and asked how long it would take to find out where they came from and where the other mangoes in that shipment were. Manually, it took six-and-a-half days to find the answer. But using blockchain, "we've got that down to about two seconds," says Mr Stockley.

The biggest challenge in introducing blockchain to supply chains is getting different organisations to collaborate. "Blockchain is a team sport," he quips. But Mr Stockley says blockchain can make supply chains "a lot more resilient, more transparent, and proactive," and will get much more attention as we emerge from coronavirus.

Amazon has changed forever how quickly we expect products to arrive, and how visible their movements should be to us on the way, says Adam Compain, chief executive of San Francisco-based ClearMetal, an AI supply chains startup. But outside Jeff Bezos's company, big corporate supply chains are still pretty static. Typically, every six months, a company will look at how long it takes products to go from China to a warehouse and on to a store shelf, he says.

Getting more up-to-date information means making sense of thousands of pieces of information each day about where products are. Much of that information can be poor or conflicting. For example, a delivery company might tell you twice that the same shipment has been delivered, or is out for delivery. But machine learning algorithms can spot patterns in this messy data: maybe the same delivery company always sends two messages, but the first is generally more accurate. AI is now much better than humans at spotting whether a storm brewing today will delay your shipping container next week, Mr Pannett says.

For thousands of businesses like Mr Firth's in France, the coming year will be tough. "Now we know until May we are okay," he says. After that, "we don't know if the customers will pay us." So each week his company, like many others, will make high-stakes decisions using a combination of luck and the best tools technology can offer.

By Padraig, Business reporter