The Nature and Future of Artificial Intelligence
The future lies in running deep learning algorithms on more compact devices, because any improvement in this space makes for a big leap in the usability of AI.
If a Raspberry Pi could run large neural networks, artificial intelligence could be deployed in many more places.
Recent research into economising AI has led to a surprisingly easy way to reduce the size of large neural networks. It’s so simple that it could fit in a tweet:
1. Train the neural network to completion.
2. Globally prune the 20% of weights with the lowest magnitudes.
3. Retrain with learning rate rewinding for the original training time.
4. Repeat steps 2 and 3 iteratively until the desired sparsity is reached.
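The four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the researchers' implementation: weights are stored as a dictionary of flat lists, a mask of 1s and 0s marks which weights are still active, and `train_fn` is a hypothetical callback that trains (or retrains with learning rate rewinding) while keeping masked weights at zero. The names `global_magnitude_prune`, `sparsity` and `iterative_prune` are made up for this example.

```python
def global_magnitude_prune(weights, masks, fraction=0.2):
    """Return new masks with the lowest-magnitude `fraction` of the
    still-active weights switched off, ranked across all layers at once."""
    # Collect (magnitude, layer name, index) for every active weight.
    active = [
        (abs(w), name, i)
        for name, layer in weights.items()
        for i, w in enumerate(layer)
        if masks[name][i] == 1
    ]
    # Prune at least one weight per round so the outer loop always progresses.
    n_prune = max(1, int(len(active) * fraction))
    active.sort(key=lambda item: item[0])  # smallest magnitudes first
    new_masks = {name: list(m) for name, m in masks.items()}
    for _, name, i in active[:n_prune]:
        new_masks[name][i] = 0
    return new_masks


def sparsity(masks):
    """Fraction of weights that have been pruned."""
    total = sum(len(m) for m in masks.values())
    kept = sum(sum(m) for m in masks.values())
    return 1 - kept / total


def iterative_prune(weights, train_fn, target_sparsity, fraction=0.2):
    """Steps 1 to 4: train, then prune and retrain until sparse enough.

    `train_fn(weights, masks)` is an assumed callback that returns trained
    weights with every masked-out weight held at zero.
    """
    if not 0 <= target_sparsity < 1:
        raise ValueError("target_sparsity must be in [0, 1)")
    masks = {name: [1] * len(layer) for name, layer in weights.items()}
    weights = train_fn(weights, masks)                            # step 1
    while sparsity(masks) < target_sparsity:                      # step 4
        masks = global_magnitude_prune(weights, masks, fraction)  # step 2
        weights = train_fn(weights, masks)                        # step 3
    return weights, masks
```

Because the ranking is global, a layer full of small weights can lose far more than 20% of its connections in a round while another layer loses none. A real implementation would work on framework tensors rather than Python lists, but the control flow stays the same.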
Keep repeating this procedure and you can make the model as small as you want, although you will almost certainly lose some accuracy along the way.
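To get a feel for how many repetitions that takes, here is a small sketch. It assumes each round removes 20% of the weights that are still active (rather than 20% of the original count), so the fraction remaining after k rounds is 0.8^k. The function name `rounds_to_reach` and the tolerance value are choices made for this example.

```python
def rounds_to_reach(target_sparsity, prune_fraction=0.2):
    """Count the pruning rounds needed before at least `target_sparsity`
    of the weights are gone, assuming each round removes `prune_fraction`
    of the weights that are still active."""
    if not 0 <= target_sparsity < 1:
        raise ValueError("target_sparsity must be in [0, 1)")
    if not 0 < prune_fraction < 1:
        raise ValueError("prune_fraction must be in (0, 1)")
    remaining = 1.0
    rounds = 0
    # Small tolerance so floating-point noise does not add an extra round.
    while remaining > (1 - target_sparsity) + 1e-12:
        remaining *= 1 - prune_fraction
        rounds += 1
    return rounds


# Getting down to one-tenth of the connections (90% sparsity):
print(rounds_to_reach(0.9))
```

Since step 3 retrains for the original training time after every pruning round, the total cost under this assumption is roughly the number of rounds plus one full training runs, which is the price paid for the smaller model.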
This line of research grew out of an ICLR paper last year (Frankle and Carbin’s Lottery Ticket Hypothesis), which showed that a DNN could perform with only one-tenth of its connections if the right subnetwork was found during training.
The timing of this finding coincides with new limits on computational requirements. Yes, you can send a model to train in the cloud, but for seriously big networks, once training time, infrastructure and energy usage are taken into account, more efficient methods are preferable because they’re simply easier to handle and manage.
Bigger AI models are more difficult to train and to use, so smaller models are preferred.
This desire for compression brought pruning algorithms back into the picture after the success of the ImageNet competition. Higher-performing models were getting bigger and bigger, and many researchers proposed techniques to try to keep them smaller.
Photo by Yuhan Du on Unsplash
Song Han of MIT developed a pruning algorithm for neural networks called AMC (AutoML for model compression), which removes redundant neurons and connections and then retrains the model to recover its initial accuracy. Frankle took this method further by rewinding the pruned model to its initial weights and retraining it at the faster, initial learning rate. Finally, in the ICLR study above, the researchers found that the model could simply be rewound to its early learning rate, without touching any parameters or weights.
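The difference between ordinary retraining and learning rate rewinding comes down to which learning rates the pruned model sees. The sketch below is an illustration under assumptions, not the paper's code: it uses a made-up step-decay schedule (`step_decay_lr`, with hypothetical values for the base rate, decay factor and milestones), takes fine-tuning to mean retraining at the final, smallest learning rate, and takes learning rate rewinding to mean replaying the tail of the original schedule while keeping the trained weights.

```python
def step_decay_lr(epoch, base_lr=0.1, drop=0.1, milestones=(30, 60)):
    """Hypothetical original schedule: multiply the rate by `drop`
    at each milestone epoch."""
    lr = base_lr
    for milestone in milestones:
        if epoch >= milestone:
            lr *= drop
    return lr


def retrain_schedule(strategy, total_epochs, retrain_epochs, lr_fn):
    """Return the per-epoch learning rates used to retrain a pruned model."""
    if not 0 < retrain_epochs <= total_epochs:
        raise ValueError("retrain_epochs must be in 1..total_epochs")
    if strategy == "fine_tune":
        # Keep training at the last (lowest) rate of the original run.
        return [lr_fn(total_epochs - 1)] * retrain_epochs
    if strategy == "lr_rewind":
        # Replay the last `retrain_epochs` epochs of the original schedule.
        start = total_epochs - retrain_epochs
        return [lr_fn(epoch) for epoch in range(start, total_epochs)]
    raise ValueError(f"unknown strategy: {strategy}")


# Retraining "for the original training time" replays the whole schedule.
rewound = retrain_schedule("lr_rewind", 90, 90, step_decay_lr)
fine_tuned = retrain_schedule("fine_tune", 90, 90, step_decay_lr)
```

With `retrain_epochs` equal to `total_epochs`, the rewound schedule starts again at the large initial rate and decays as before, while the fine-tuning schedule stays at the small final rate throughout. Weight rewinding would additionally reset the surviving weights to their early values, which this sketch does not model.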
Generally, as a model gets smaller its accuracy gets worse. However, the proposed method performs better than both Han’s AMC and Frankle’s weight-rewinding method.
It’s still unclear why this method works as well as it does, but it is simple to implement and doesn’t require time-consuming tuning. Frankle says: “It’s clear, generic, and drop-dead simple.”
Model compression, and the broader idea of economising machine learning algorithms, is an important field in which further gains can be made. Leaving models too large reduces their applicability and usability. Sure, you can keep your algorithm sitting behind an API in the cloud, but there are many constraints on running it locally.
For most industries, models are often limited in their usability because they may be too big or too opaque. Being able to discern why a model works so well will help us build not only better models but also more efficient ones.
Neural networks are so big because you want the model to develop its connections naturally, driven by the data. It’s hard for a human to understand these connections, but understanding the model would let us chop out the useless ones.
The golden nugget would be a model that can reason: a neural network that trains its connections based on logic, thereby reducing both training time and final model size. However, we’re some way from having an AI that controls the training of AI.
Thanks for reading, and please let me know if you have any questions!
Keep up to date with my latest articles here!
Translated from: https://towardsdatascience.com/the-future-of-ai-is-in-model-compression-145158df5d5e