Please answer without looking at Wikipedia or otherwise finding the answer. Please answer, even if the answer is "no". I'm trying to get a measure of the extent of a bit of knowledge, here:
How many of you know, off the top of your head, that the defining characteristic of a linear system is superposition?
That if a system obeys superposition it must be linear?
The difference between a time-varying and a non-linear system?
Why the fact that a system obeys superposition vastly eases the task of analyzing its dynamic behavior?
------------
From memory, a linear system is not defined simply by superposition. A linear function, by which such a system is represented mathematically, must possess the linearity properties: addition, scalar multiplication (homogeneity), and one more that I can't recall (since I was forbidden to look at wiki, you'll have to do it for me ;). Basically,
A linear system essentially has a well-defined Fourier transform, causality, superposition, etc. But superposition by itself does not make a linear system.
Essentially, if you add two linear systems together you get another linear system (that is the idea of superposition). But there are systems that, when added together, give linear systems, and others that give non-linear systems.
In any case, LTI systems are the simplest types of systems and are generally equivalent (ultimately) to solving systems of linear equations. Hence it is natural that these would be the first types of systems studied and that the most machinery was developed for them. Much of mathematics deals with these types of systems.
Most non-linear systems are intractable: not only is there no way to solve them algebraically, their numerical solutions are often unstable.
Obviously, if you can break a system into components, study the individual components, and easily "reassemble" the system, you have drastically reduced the complexity. This is why "superposition" is important. Again, what matters is the concept of linearity and the fact that linear systems "add".
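To make the "adding" concrete, here is a minimal sketch of the superposition test. The two systems are hypothetical stand-ins: a moving-average filter (linear) and a squaring operation (non-linear).

```python
import numpy as np

def S_linear(x):
    # 3-tap moving average: a linear, time-invariant operation
    return np.convolve(x, np.ones(3) / 3, mode="full")

def S_nonlinear(x):
    # squaring the input is not linear
    return x ** 2

rng = np.random.default_rng(0)
f = rng.standard_normal(32)
g = rng.standard_normal(32)
a, b = 2.0, -0.5

# Superposition: S(a*f + b*g) == a*S(f) + b*S(g)
lhs = S_linear(a * f + b * g)
rhs = a * S_linear(f) + b * S_linear(g)
print(np.allclose(lhs, rhs))      # True: the moving average obeys superposition

lhs_n = S_nonlinear(a * f + b * g)
rhs_n = a * S_nonlinear(f) + b * S_nonlinear(g)
print(np.allclose(lhs_n, rhs_n))  # False: squaring does not
```

Once a system passes this test for arbitrary inputs and scalars, you can study each component input separately and recombine the responses.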
Another way of looking at it: if one knows the impulse response of a linear system, one knows how the system will respond to any input, since any function can be written as a convolution/integral/sum of impulse functions. Since the system is linear, the operations commute:

let S(·) be the system and f(t) the input:
S(f(t)) = S(∫ f(τ) δ(t − τ) dτ) = ∫ f(τ) S(δ(t − τ)) dτ

So if S(δ(t)) is known or easily obtained (which it almost always is), then the "response" (and what we know as the transfer function) is very easily obtained. For non-linear systems, S(·) does not commute with the integral, and we can't draw such conclusions.
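The same identity can be checked numerically in discrete time. The system below is a hypothetical first-order recursive filter; the point is that running it on an arbitrary input gives the same result as convolving that input with its impulse response.

```python
import numpy as np

def S(x):
    # hypothetical LTI system: y[n] = 0.5*y[n-1] + x[n]
    y = np.zeros(len(x))
    prev = 0.0
    for n, xn in enumerate(x):
        prev = 0.5 * prev + xn
        y[n] = prev
    return y

N = 64
impulse = np.zeros(N)
impulse[0] = 1.0
h = S(impulse)            # impulse response: h[n] = 0.5**n

rng = np.random.default_rng(1)
f = rng.standard_normal(N)

direct = S(f)                        # run the system on f directly
via_conv = np.convolve(f, h)[:N]     # reconstruct the response from h alone
print(np.allclose(direct, via_conv))  # True
```

A single impulse measurement characterizes the system completely; that is exactly the commuting-of-operations argument above, in sampled form.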
But note there is a similar way to analyze non-linear systems: instead of using the dirac function, we use white noise. In this case the decomposition is much more complex, but supposedly it is possible (mathematically). There simply is not enough machinery/intelligence to make analyzing such systems productive.
One other point that makes non-linear systems so complex is that a very slight change in the structure can result in drastically different outcomes. This is not true for linear systems: small perturbations result in small perturbations. Of course, there are non-linear systems that are approximately linear or can be linearly approximated (for example, the simple pendulum).
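For the pendulum example, the linearization replaces the restoring term sin(θ) with θ, which is accurate near the equilibrium. A quick check of the relative error at a few (illustrative) angles:

```python
import math

# sin(theta) ~ theta for small angles: this is the small-angle
# linearization of the simple pendulum's equation of motion.
for deg in (1, 5, 10, 30):
    theta = math.radians(deg)
    rel_err = abs(math.sin(theta) - theta) / math.sin(theta)
    print(f"{deg:3d} deg: relative error {rel_err:.4%}")
```

The error grows roughly like θ²/6, so the linear model is excellent for small swings and degrades as the amplitude grows.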
Most electrical components are approximately linear, and therefore when combining such components we get an approximately linear system. When a non-linear component is introduced, we generally approximate it as a piecewise linear system (transistor "regions", etc.).
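A sketch of the piecewise-linear idea, using a diode rather than a transistor for brevity. The knee voltage and on-resistance below are illustrative values, not a real device model: below the knee the diode is treated as an open circuit, above it as a battery in series with a resistor.

```python
V_KNEE = 0.7   # assumed knee voltage (volts)
R_ON = 10.0    # assumed on-resistance (ohms)

def diode_current(v):
    # piecewise-linear I-V curve: each "region" is a linear segment
    if v < V_KNEE:
        return 0.0                 # "off" region: open circuit
    return (v - V_KNEE) / R_ON     # "on" region: straight line

for v in (0.2, 0.7, 1.0, 2.0):
    print(f"V={v:.1f} V -> I={diode_current(v) * 1000:.1f} mA")
```

Within each region the circuit is linear, so all the linear machinery (superposition, transfer functions) applies region by region.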
In some sense linearity is all that most people can comprehend.