See also Gradient Boost!!!
We start with a very small tree (just a single leaf) and calculate its residuals. Each following tree then tries to predict not the target variable, but the residuals left over by the trees before it. Each tree is scaled by the learning rate. With each tree the residuals become smaller (link). Vincent explains it with more code.
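Independent of the sources referenced in these notes, the loop described above (leaf, residuals, tree on the residuals, scale by the learning rate, repeat) can be sketched in plain Python. This is only a minimal illustration under assumptions: the data is made up, each tree is a one-split stump on a single feature, and the names `fit_stump` and `gradient_boost` are hypothetical helpers, not any library's API.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (stump) to the residuals."""
    best = None
    # Try every distinct value except the largest, so both sides stay non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_trees=20, learning_rate=0.1):
    # The first "tree" is a single leaf: the average of the target.
    leaf = sum(ys) / len(ys)
    predictions = [leaf] * len(ys)
    trees = []
    history = []  # mean squared residual before each new tree is added
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        history.append(sum(r * r for r in residuals) / len(residuals))
        tree = fit_stump(xs, residuals)  # predict the residuals, not the target
        trees.append(tree)
        # Each tree's contribution is scaled by the learning rate.
        predictions = [p + learning_rate * tree(x)
                       for p, x in zip(predictions, xs)]

    def predict(x):
        return leaf + learning_rate * sum(tree(x) for tree in trees)

    return predict, history


# Made-up example data: height (m) as the feature, weight (kg) as the target.
heights = [1.6, 1.6, 1.5, 1.8, 1.5, 1.4]
weights = [88, 76, 56, 73, 77, 57]
predict, history = gradient_boost(heights, weights)
print(history[0], history[-1])  # the mean squared residual shrinks
```

A small learning rate means every tree only nudges the predictions a little, which is why many trees are needed before the residuals get small.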

Check out this video to understand ‣!
See AdaBoost
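AdaBoost starts by making a stump (link), i.e. a tree with a single split. It uses an iterative approach to learn from the mistakes of weak classifiers: each new stump pays more attention to the samples the previous stumps got wrong. AdaBoost also scales the stumps: the better a stump performs, the larger its weight (its say) in the final vote.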

AdaBoost continues to make stumps until it has made the number of stumps you asked for, or it has a perfect fit.
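The AdaBoost procedure above can be sketched in plain Python as follows. Treat it as a rough illustration under assumptions: the data is made up, the labels are coded as +1/-1, misclassified samples are emphasized by reweighting (the textbook variant, rather than resampling), and `best_stump` and `adaboost` are hypothetical names.

```python
import math


def stump_predict(x, threshold, polarity):
    # A stump: one split on a single feature; polarity flips which side is +1.
    return polarity if x <= threshold else -polarity


def best_stump(xs, ys, sample_weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for x, y, w in zip(xs, ys, sample_weights)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, n_stumps=10):
    n = len(xs)
    sample_weights = [1 / n] * n  # every sample starts equally important
    stumps = []  # (say, threshold, polarity)

    def predict(x):
        score = sum(say * stump_predict(x, t, p) for say, t, p in stumps)
        return 1 if score >= 0 else -1

    for _ in range(n_stumps):
        error, threshold, polarity = best_stump(xs, ys, sample_weights)
        eps = 1e-10  # avoids division by zero for a perfect stump
        say = 0.5 * math.log((1 - error + eps) / (error + eps))
        stumps.append((say, threshold, polarity))
        # Stop early once the combined stumps fit the training data perfectly.
        if all(predict(x) == y for x, y in zip(xs, ys)):
            break
        # Increase the weight of misclassified samples, decrease the rest.
        sample_weights = [
            w * math.exp(-say * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, sample_weights)
        ]
        total = sum(sample_weights)
        sample_weights = [w / total for w in sample_weights]
    return predict, stumps


# Made-up data that no single stump can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
predict, stumps = adaboost(xs, ys)
print(len(stumps), [predict(x) for x in xs])
```

The `say` value is what scales each stump: a stump with a low weighted error gets a large say, one that is barely better than guessing gets a say close to zero.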
Gradient Boost starts by making a single leaf, instead of a tree or stump (link). This leaf represents an initial guess (e.g. the average) for the target variable, here the Weight (see dataset), of all the samples. Then Gradient Boost builds a tree based on the errors of that guess.

We start by building a first tree (which is just a single leaf). Then we calculate the residuals of this tree, i.e. the differences between the observed values and the leaf's prediction.
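As a small worked example of this first step, assuming the leaf is the average of the target and using made-up weights in kg (not necessarily the values from the dataset mentioned above):

```python
# Made-up target values (weights in kg).
weights = [88, 76, 56, 73, 77, 57]

# The first "tree" is a single leaf: the average of all weights.
leaf = sum(weights) / len(weights)

# Residual = observed value minus the leaf's prediction.
residuals = [w - leaf for w in weights]

print(round(leaf, 2))
print([round(r, 2) for r in residuals])
```

Because the leaf is the average, the residuals sum to zero; these residuals are what the next tree tries to predict.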

The next thing we do is build a tree based on the errors (residuals) from the first tree. The trick is that we try to predict the residuals instead of the original weights (link)! This tree is a bit larger than the one we started with, but it's still very simple, as you can see in the video.