French startup Snips is now helping you build a custom voice assistant for your device. Snips doesn’t use Amazon’s Alexa Voice Service or Google Assistant SDK; the company has built its own voice assistant that you can embed in your devices. Best of all, it works offline, so nothing is sent to the cloud.
If you want to understand how a voice assistant works, you can split it into multiple parts. First, it starts with a wakeword. Snips has a handful of wakewords by default, such as “Hey Snips,” but you can also pay the company to create your own wakeword.
For instance, if you’re building a multimedia robot called Keecker, you can create a custom “Hey Keecker” hot word. Snips then uses deep learning to accurately detect when someone is trying to talk to your voice assistant.
The second part is automatic speech recognition. A voice assistant transcribes your voice into a text query. Popular home assistants usually send a small audio file with your voice and use servers to transcribe your query.
Snips can transcribe your voice into text on the device itself. It works on anything that is more powerful than a Raspberry Pi. For now, Snips is limited to English and French. You’ll have to use a third-party automatic speech recognition API for other languages.
Then, Snips needs to understand your query. The company has developed its own natural language understanding capabilities. But there are hundreds, even thousands, of different ways to ask a simple question about the weather, for instance.
That’s why Snips is launching a data generation service today. I saw a demo yesterday, and the interface looks like Automator on macOS or Workflow on iOS. You define variables such as “date” and “location,” mark whether they are mandatory for the query, and enter a few example phrasings.
But instead of manually entering hundreds of variations of the same query, you can pay $100 to $800 to let Snips do the work for you. The startup manually checks your request then posts it on Amazon Mechanical Turk and other crowdsourcing marketplaces. Finally, Snips cleans up your data set and sends it back to you.
You can either download it and reuse it in another chatbot or voice assistant, or you can use it with Snips’ own voice assistant. You can also make your capability public. Other Snips users can add this capability to their own assistant by browsing a repository of pre-trained capabilities.
Step 1. Create Intent
Step 2. Choose datagen package
Step 3. Confirm results
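The intent-definition flow above can be sketched as a simple data structure. This is a hypothetical illustration: the slot names, types and overall schema here are my own, not Snips’ actual format.

```python
# Hypothetical sketch of an intent definition like the one described above.
# The schema, slot types and utterance syntax are illustrative assumptions.
weather_intent = {
    "intent": "GetWeather",
    "slots": [
        {"name": "date", "type": "datetime", "required": False},
        {"name": "location", "type": "city", "required": True},
    ],
    # A few seed utterances; the data generation service would expand these
    # into hundreds of crowdsourced variations before training.
    "utterances": [
        "what's the weather like in {location}",
        "will it rain in {location} {date}",
        "give me the forecast for {location}",
    ],
}

def required_slots(intent):
    """Return the slot names that must be filled for a query to be complete."""
    return [s["name"] for s in intent["slots"] if s["required"]]

print(required_slots(weather_intent))  # ['location']
```

Marking a slot as required is what lets the assistant ask a follow-up question (“for which city?”) instead of failing on an incomplete query.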
A Snips voice assistant typically requires hundreds of megabytes but is quite easy to update. After installing the Snips app on your device, you just need to replace a zip library file to add new capabilities.
You also need to implement the actual actions. Snips only translates what someone is saying into a parsable query. For instance, Snips can understand that “could you please turn on the bedroom light?” means “light + bedroom + on.” A developer still needs to implement the action based on those three parameters.
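That division of labor, where the assistant hands back structured parameters and the developer wires them to real hardware, might look like the following minimal sketch. All names here are placeholders; Snips’ actual developer API is not shown.

```python
# A minimal sketch of the developer-side action code described above:
# the assistant parses "could you please turn on the bedroom light?" into
# structured slots, and the developer maps them to a device action.
# Function and slot names are illustrative assumptions.
def handle_light_intent(slots, set_light_state):
    """Dispatch a parsed 'light' intent to a device-control callback."""
    room = slots.get("room", "living room")   # default room is an assumption
    state = slots.get("state")                # expected: "on" or "off"
    if state not in ("on", "off"):
        raise ValueError(f"unsupported light state: {state!r}")
    set_light_state(room, state == "on")
    return f"Turning the {room} light {state}"

# A fake device callback standing in for a real smart-home API.
log = []
reply = handle_light_intent(
    {"room": "bedroom", "state": "on"},
    lambda room, on: log.append((room, on)),
)
```

The assistant never touches the lightbulb itself; it only guarantees that the three parameters (device, room, state) arrive in a machine-readable form.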
Developers are already playing with Snips to test its capabilities. But the company hopes that big device manufacturers are going to embed Snips into their future products. Eventually, you could think about a coffee maker with a Snips voice assistant.
Compared to Amazon’s or Google’s wide-ranging assistants, Snips thinks that you don’t need to embed a complete voice assistant into all your devices. You only want to tell your Roomba to start vacuuming; there’s no need to start a Spotify playlist from your vacuum cleaner.
This approach presents a few advantages when it comes to privacy and network effects. Big tech companies are creating ecosystems of internet-of-things devices. People are buying lightbulbs, security cameras and door locks that work with the Amazon Echo, for instance.
But if you can talk to the devices themselves, you don’t need to hook up your devices with a central home speaker — the central hub disappears. If voice assistants are more than a fad, Snips is building some promising technology. And Snips could get some licensing revenue for each device that comes with its voice assistant.
Featured Image: Bryce Durbin/TechCrunch
This autonomous 3D scanner figures out where it needs to look
If you need to make a 3D model of an object, there are plenty of ways to do so, but most of them are only automated to the extent that they know how to spin in circles around that object and put together a mesh. This new system from Fraunhofer does it more intelligently, getting a basic idea of the object to be scanned and planning out what motions will let it do so efficiently and comprehensively.
It takes a potentially time-consuming step out of the process, in which a scan is completed and the user has to inspect it, find where it falls short (an overhanging part occluding another, for instance, or an area of greater complexity that requires closer scrutiny) and set up a new scan to make up for these shortcomings. Alternatively, the scanner might already need a 3D model loaded in order to recognize what it’s looking at and know where to focus.
Fraunhofer’s project, led by Pedro Santos at the Institute for Computer Graphics Research, aims to get it right the first time by having the system evaluate its own imagery as it goes and plan its next move.
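The “evaluate as you go” loop described above resembles the classic next-best-view problem in 3D scanning. The toy sketch below shows the general idea of greedily choosing whichever viewpoint is expected to reveal the most unseen surface; it is not Fraunhofer’s actual algorithm, and the coverage-estimation function is an assumed input.

```python
# Toy next-best-view planner: repeatedly scan from the candidate viewpoint
# expected to add the most new surface coverage, until a target is reached.
# This illustrates the general technique only, not Fraunhofer's system.
def plan_scan(candidate_views, estimate_new_coverage, coverage_target=0.95):
    """Greedy loop returning the views chosen and the coverage achieved."""
    covered = 0.0
    chosen = []
    remaining = list(candidate_views)
    while covered < coverage_target and remaining:
        best = max(remaining, key=lambda v: estimate_new_coverage(v, covered))
        gain = estimate_new_coverage(best, covered)
        if gain <= 0:
            break  # no remaining view reveals anything new
        covered += gain
        chosen.append(best)
        remaining.remove(best)
    return chosen, covered

# Example with made-up per-view coverage gains, capped at full coverage.
gains = {"front": 0.5, "back": 0.4, "top": 0.2}
estimate = lambda v, covered: min(gains[v], max(0.0, 1.0 - covered))
chosen, covered = plan_scan(["front", "back", "top"], estimate)
```

The real system replaces the toy estimate with an online evaluation of the imagery captured so far, which is what lets it handle components it has never seen.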
“The special thing about our system is that it scans components autonomously and in real time,” he said in a news release. It’s able to “measure any component, irrespective of its design — and you don’t have to teach it.”
This could help in creating one-off duplicates of parts the system has never seen before, like a custom-made lamp or container, or a replacement for a vintage car’s door or engine.
If you happen to be in Hanover in April, drop by Hannover Messe and try it out for yourself.
Featured Image: Fraunhofer
SignAll is slowly but surely building a sign language translation platform
Translating is difficult work, the more so the further two languages are from one another. French to Spanish? Not a problem. Ancient Greek to Esperanto? Considerably harder. But sign language is a unique case, and translating it uniquely difficult, because it is fundamentally different from spoken and written languages. All the same, SignAll has been working hard for years to make accurate, real-time machine translation of ASL a reality.
One would think that with all the advances in AI and computer vision happening right now, a problem as interesting and beneficial to solve as this would be under siege by the best of the best. Even thinking about it from a cynical market-expansion point of view, an Echo or TV that understands sign language could attract millions of new (and very thankful) customers.
Unfortunately, that doesn’t seem to be the case — which leaves it to small companies like Budapest-based SignAll to do the hard work that benefits this underserved group. And it turns out that translating sign language in real time is even more complicated than it sounds.
CEO Zsolt Robotka and chief R&D officer Marton Kajtar were exhibiting this year at CES, where I talked with them about the company, the challenges they were taking on, and how they expect the field to evolve. (I’m glad to see the company was also at Disrupt SF in 2016, though I missed them then.)
Perhaps the most striking thing to me about the whole business is just how complex the problem they are attempting to solve really is.
“It’s multi-channel communication; it’s really not just about shapes or hand movements,” explained Robotka. “If you really want to translate sign language, you need to track the entire upper body and facial expressions — that makes the computer vision part very challenging.”
Right off the bat that’s a difficult ask, since that’s a huge volume in which to track subtle movement. The setup right now uses a Kinect 2 more or less at center and three RGB cameras positioned a foot or two out. The system must reconfigure itself for each new user, since just as everyone speaks a bit differently, all ASL users sign differently.
“We need this complex configuration because then we can work around the lack of resolution, both time and spatial (i.e. refresh rate and number of pixels), by having different points of view,” said Kajtar. “You can have quite complex finger configurations, and the traditional methods of skeletonizing the hand don’t work because they occlude each other. So we’re using the side cameras to resolve occlusion.”
As if that wasn’t enough, facial expressions and slight variations in gestures also inform what is being said, for example adding emotion or indicating a direction. And then there’s the fact that sign language is fundamentally different from English or any other common spoken language. This isn’t transcription — it’s full-on translation.
“The nature of the language is continuous signing. That makes it hard to tell when one sign ends and another begins,” Robotka said. “But it’s also a very different language; you can’t translate word by word, recognizing them from a vocabulary.”
SignAll’s system works with complete sentences, not just individual words presented sequentially. A system that just takes down and translates one sign after another (limited versions of which exist) would be liable to create misinterpretations or overly simplistic representations of what was said. While that might be fine for simple things like asking directions, real meaningful communication has layers of complexity that must be detected and accurately reproduced.
Somewhere in between those two options is what SignAll is targeting for its first public pilot of the system, at Gallaudet University. This Washington, D.C. school for the deaf is renovating its welcome center, and SignAll will be installing a translation booth there so that hearing visitors can interact with the deaf staff.
It’s a good opportunity to test this, Robotka said, since usually the information deficit runs the other way: a deaf person needing information from a hearing person. Visitors who can’t sign can speak, and the query can be turned into text (unless the staff member can read lips) and responded to with signs, which are then translated back into text or synthesized speech.
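One round trip through the booth, as described above, can be sketched as a pipeline of four stages. Every function here is a placeholder stub; SignAll’s real pipeline involves multi-camera computer vision and a trained translation model, none of which is shown.

```python
# Highly simplified sketch of the two-way translation booth flow described
# above. All stage functions are hypothetical stand-ins, passed in as
# callbacks so the pipeline structure itself is what's illustrated.
def booth_exchange(hearing_utterance, speech_to_text, text_to_sign,
                   sign_to_text, staff_respond):
    """One round trip: spoken question in, signed answer out as text."""
    query_text = speech_to_text(hearing_utterance)   # visitor speaks
    signed_query = text_to_sign(query_text)          # rendered for deaf staff
    signed_reply = staff_respond(signed_query)       # staff answers in ASL
    return sign_to_text(signed_reply)                # translated for visitor

# Trivial stub stages, standing in for the real recognition/translation steps.
reply = booth_exchange(
    "WHERE IS THE WELCOME DESK?",
    speech_to_text=str.lower,
    text_to_sign=lambda text: ("SIGNED", text),
    sign_to_text=lambda signed: signed[1],
    staff_respond=lambda query: ("SIGNED", "straight ahead, on your left"),
)
```

The point of the structure is that neither participant changes how they communicate; the pipeline does all the conversion between modalities.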
It sounds complicated, and in a technical way it is, but really neither person needs to do anything but communicate the way they normally do, and they can be understood by the other. When you think about it, that’s pretty amazing.
To prepare for the pilot, SignAll and Gallaudet worked together to create a database of signs specific to the application at hand or local to the university itself. There’s no comprehensive 3D representation of all signs, if that’s even possible, so for now the system will cater to the environment in which it is deployed, with domain-specific gestures being added on a rolling basis to a database.
“That was a huge effort, to collect the 3D data of all these signs. We just finished, with their support,” said Robotka. “We did interviews, collected some conversations that occurred there, to make sure we have all the language elements and signs. We expect to do that kind of customization work for the first couple of pilots.”
This long-running project is a sobering reminder of both the possibilities and limitations of technology. True, automatic translation of sign language is a goal only just becoming possible with advances in computer vision, machine learning, and imaging. But unlike many other translation or CV tasks, it requires a great deal of human input at every step, not just to achieve basic accuracy, but to ensure the humanitarian aspects are present as well.
After all, this isn’t just about the convenience of reading a foreign news article or communicating abroad, but of a class of people who are fundamentally excluded from what most people think of as in-person communication — speech. To improve their lot is worth waiting for.
Featured Image: SignAll
Move Guides acquires Polaris Global Mobility to expand services for expats and relocation
Relocation, relocation, relocation, as the saying (sort of) goes. On the heels of raising $48 million last year to tap into the growing needs of businesses to handle global workforces (a huge if sometimes controversial area of the job market), Move Guides today announced an acquisition to expand its footprint in the market and the services it offers to customers. The startup is buying Polaris Global Mobility, which builds software for large enterprises to manage expat programs and payroll. Polaris works with large tech companies like Dell (and a number of others that prefer not to be named), as well as companies in sectors like industrial, financial and pharmaceutical.
Financial terms of the deal are not being disclosed, said Move Guides’ CEO and founder Brynne Kennedy in an interview. Polaris is bootstrapped, has been around for 16 years, and has “grown to significant, double-digit ARR since its founding,” she said.
Kennedy said the acquisition was made to extend Move Guides’ touchpoints with customers. It started out by going after legacy players in the area of relocation, such as Cartus, Brookfield and Aires. Move will now be able to offer more after-move services such as tax and payroll to those customers.
“We considered building our own expatriate management solutions,” she said, “but decided that it made more sense to acquire Polaris and accelerate value for our customers. Our customers have been asking for this for a while, and we saw great synergies between the technologies and companies. The complexity of building a global tax engine and expatriate payroll functionality is high, and Polaris is best in class.”
Globalization is a major force in our world today. We usually hear about it in the context of how goods are made in one market and often procured by buyers somewhere completely different, or how newer services like the internet have “shrunk” the world to let news and other information travel at the speed of light, and the negative and positive consequences of each.
What’s less buzzy, but actually part and parcel of both of those developments, is the movement of people, who are sent to different markets to build these businesses and can do so because of the explosion of networking and the many products built on it to improve communication (I am a direct beneficiary of that, here in London, working as a writer and news editor for TC, a San Francisco-based publication).
The movement of people, and how they can subsequently be employed outside their home countries, has become a flashpoint topic in the US and in other countries like the UK, a testament to how big and potentially disruptive the space is and will continue to be. Global mobility is forecast to be an $11 billion to $15 billion market by 2023, larger even than core human resources services, said Kennedy.
This is driving some interesting business opportunities for those looking to take leadership positions in the space. Beyond traditional players like EY and Equus (which work together) and immigration data startups like Envoy Global, companies like Salesforce and Microsoft, as well as startups like Zenefits, are all considering how to grow deeper into back-office services for enterprises. In that light, a company like Move Guides (or Polaris, for that matter) might be an interesting target.
“We are actually seeing two things, consolidation and ‘SaaS-ification,’” Kennedy noted, “and we have reinforced our position with the addition of Polaris.”
Kennedy said Move Guides is not raising capital in the near term. It has raised about $75 million to date and is valued at around $100 million, according to data from PitchBook. It’s not clear how that valuation changes with the addition of Polaris, but the deal is an obvious bid to position the combined company to take on a bigger role, and bigger money, in the future.
“MOVE Guides has established itself as an innovation leader,” said Bryan Williams, CEO of Polaris in a statement. “They deliver an exceptional experience to employees and very clearly share our commitment to digital innovation and service, which made this deal a natural choice to take our company to the next level.”