【Question Title】: Understanding 'backward()': How to code the PyTorch function '.backward()' from scratch?
【Posted】: 2021-01-06 23:42:56
【Question】:

I'm a beginner learning deep learning, and I have been trying to understand what '.backward()' does in PyTorch, since that is where most of the work happens. To understand it in detail, I would like to re-implement what the function does step by step. Can you recommend any resources (books, videos, GitHub repositories) to get started on writing such a function? Thanks for your time, and I hope you can help.

【Comments】:

  • Maybe try youtube.com/…. The first two videos in that playlist are also great (as is the whole channel), but they cover basics you may already be familiar with.

Tags: python deep-learning pytorch


【Solution 1】:

backward() computes gradients with respect to (w.r.t.) the leaves of the graph. The grad() function is more general: it can compute gradients w.r.t. any input (including the leaves).
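To make the distinction concrete, here is a minimal sketch using the standard PyTorch API: backward() accumulates the gradient into the leaf tensor's .grad attribute, while torch.autograd.grad() simply returns the gradient for whatever inputs you ask about.

```python
import torch

# A leaf tensor with gradient tracking enabled.
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2

# backward() accumulates dy/dx into the leaf's .grad attribute.
y.backward()
print(x.grad)  # tensor(6.)

# torch.autograd.grad() instead returns the gradient directly,
# for any inputs you pass (not just leaves).
x2 = torch.tensor(3.0, requires_grad=True)
(gx,) = torch.autograd.grad(x2 ** 2, x2)
print(gx)  # tensor(6.)
```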

I implemented a grad() function a while ago; you can take a look at it. It uses the power of automatic differentiation (AD).

import math
class ADNumber:
    
    def __init__(self,val, name=""):
        self.name=name
        self._val=val
        self._children=[]         
        
    def __truediv__(self,other):
        new = ADNumber(self._val / other._val, name=f"{self.name}/{other.name}")
        self._children.append((1.0/other._val,new))
        other._children.append((-self._val/other._val**2,new)) # first derivation of 1/x is -1/x^2
        return new

    def __mul__(self,other):
        new = ADNumber(self._val*other._val, name=f"{self.name}*{other.name}")
        self._children.append((other._val,new))
        other._children.append((self._val,new))
        return new

    def __add__(self,other):
        if isinstance(other, (int, float)):
            other = ADNumber(other, str(other))
        new = ADNumber(self._val+other._val, name=f"{self.name}+{other.name}")
        self._children.append((1.0,new))
        other._children.append((1.0,new))
        return new

    def __sub__(self,other):
        new = ADNumber(self._val-other._val, name=f"{self.name}-{other.name}")
        self._children.append((1.0,new))
        other._children.append((-1.0,new))
        return new
    
            
    @staticmethod
    def exp(self):
        new = ADNumber(math.exp(self._val), name=f"exp({self.name})")
        self._children.append((math.exp(self._val),new)) # first derivative of exp(x) is exp(x)
        return new

    @staticmethod
    def sin(self):
        new = ADNumber(math.sin(self._val), name=f"sin({self.name})")      
        self._children.append((math.cos(self._val),new)) # first derivative is cos
        return new
    
    def grad(self,other):
        if self==other:            
            return 1.0
        else:
            result=0.0
            for child in other._children:                 
                result+=child[0]*self.grad(child[1])                
            return result
        
A = ADNumber # shortcuts
sin = A.sin
exp = A.exp

def print_childs(f, wrt): # with respect to
    for e in f._children:
        print("child:", wrt, "->" , e[1].name, "grad: ", e[0])
        print_childs(e[1], e[1].name)
        
    
x1 = A(1.5, name="x1")
x2 = A(0.5, name="x2")
f=(sin(x2)+1)/(x2+exp(x1))+x1*x2

print_childs(x2,"x2")
print("\ncalculated gradient for the function f with respect to x2:", f.grad(x2))

Output:

child: x2 -> sin(x2) grad:  0.8775825618903728
child: sin(x2) -> sin(x2)+1 grad:  1.0
child: sin(x2)+1 -> sin(x2)+1/x2+exp(x1) grad:  0.20073512936690338
child: sin(x2)+1/x2+exp(x1) -> sin(x2)+1/x2+exp(x1)+x1*x2 grad:  1.0
child: x2 -> x2+exp(x1) grad:  1.0
child: x2+exp(x1) -> sin(x2)+1/x2+exp(x1) grad:  -0.05961284871202578
child: sin(x2)+1/x2+exp(x1) -> sin(x2)+1/x2+exp(x1)+x1*x2 grad:  1.0
child: x2 -> x1*x2 grad:  1.5
child: x1*x2 -> sin(x2)+1/x2+exp(x1)+x1*x2 grad:  1.0

calculated gradient for the function f with respect to x2: 1.6165488003791766
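As a sanity check (using only the standard library; the hand-derived formula below is my own, not part of the answer), the AD result can be compared against the closed-form derivative. For f = (sin(x2)+1)/(x2+exp(x1)) + x1*x2, the quotient rule on the first term and the product rule on the second give df/dx2 = cos(x2)/(x2+exp(x1)) - (sin(x2)+1)/(x2+exp(x1))**2 + x1.

```python
import math

x1, x2 = 1.5, 0.5
d = x2 + math.exp(x1)  # denominator of the quotient term

# df/dx2 via the quotient rule on (sin(x2)+1)/(x2+exp(x1)),
# plus x1 from the product term x1*x2.
grad = math.cos(x2) / d - (math.sin(x2) + 1) / d**2 + x1
print(grad)  # ~1.6165488, matching the AD result above
```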

【Comments】:

  • Thanks for sharing your implementation.