Working Notes: a commonplace notebook for recording & exploring ideas.
— Kunal
To save some time, I'm going to speed-read the slides and readings first, and then watch the videos where that seems worthwhile.
Videos:
each model is basically
a linear hypothesis function applies a linear operator (i.e. a matmul) as its transformation
classification error
softmax / cross entropy loss
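A rough sketch of how these pieces fit together, to make the notes concrete. This assumes NumPy, a batch of inputs stored as rows of `X`, and integer class labels `y`; the function names and the toy data are my own, not from the course material.

```python
import numpy as np

def linear_hypothesis(X, theta):
    """Class logits for a batch: one matmul, (n, d) @ (d, k) -> (n, k)."""
    return X @ theta

def classification_error(logits, y):
    """Fraction of examples whose highest-scoring class is not the label."""
    return float(np.mean(logits.argmax(axis=1) != y))

def softmax_cross_entropy(logits, y):
    """Mean softmax / cross-entropy loss for integer labels y."""
    # Subtract the row max before exponentiating for numerical stability;
    # this does not change the softmax probabilities.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_sum_exp = np.log(np.exp(shifted).sum(axis=1))
    # loss_i = log(sum_j exp(h_j)) - h_{y_i}
    return float(np.mean(log_sum_exp - shifted[np.arange(len(y)), y]))

# Toy data (hypothetical): 3 examples, 2 features, 3 classes.
X = np.array([[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]])
y = np.array([0, 2, 1])
theta = np.zeros((2, 3))  # all-zero weights give uniform class probabilities

logits = linear_hypothesis(X, theta)
print(softmax_cross_entropy(logits, y))  # equals log(3) for uniform predictions
print(classification_error(logits, y))
```

The classification error is flat almost everywhere, so it gives no useful gradient; the cross-entropy loss is the differentiable stand-in that actually gets optimized.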
numerical differentiation -- approximate the gradient
Automatic Differentiation
numerical differentiation is a good way to check automatic differentiation
partial adjoints for handling multiple pathways
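A small sketch of both ideas on one function, under my own assumptions: the example function y = exp(x) * (exp(x) + 1) and all variable names are illustrative, and the reverse pass is written out by hand rather than produced by any library. The intermediate v2 feeds two later nodes, so its adjoint is the sum of one partial adjoint per pathway. A central-difference estimate then serves as the numerical check.

```python
import math

def f(x):
    """y = exp(x) * (exp(x) + 1), written as a tiny computational graph."""
    v2 = math.exp(x)
    v3 = v2 + 1.0
    return v2 * v3

def grad_reverse(x):
    """Hand-written reverse-mode pass for f (illustrative, not a library)."""
    # Forward pass: v1 -> v2 -> v3, and v4 depends on both v2 and v3.
    v1 = x
    v2 = math.exp(v1)
    v3 = v2 + 1.0
    # v4 = v2 * v3 is the output.

    # Backward pass: adjoint v_bar = d(output) / d(v).
    v4_bar = 1.0
    v3_bar = v4_bar * v2            # d(v2 * v3) / d(v3) = v2
    v2_bar_via_v4 = v4_bar * v3     # partial adjoint along v2 -> v4
    v2_bar_via_v3 = v3_bar * 1.0    # partial adjoint along v2 -> v3
    v2_bar = v2_bar_via_v4 + v2_bar_via_v3  # sum over pathways
    v1_bar = v2_bar * v2            # d(exp(v1)) / d(v1) = exp(v1) = v2
    return v1_bar

def grad_numerical(func, x, eps=1e-5):
    """Central-difference approximation of the derivative of func at x."""
    return (func(x + eps) - func(x - eps)) / (2.0 * eps)

x = 0.5
print(grad_reverse(x), grad_numerical(f, x))  # the two should agree closely
```

The numerical estimate is too slow and too imprecise to train with, but it needs no knowledge of the graph, which is what makes it useful as an independent check on the adjoint bookkeeping.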