Use the algorithm described in the hint. That is, consider an algorithm with $\log n$ phases, where you have each processor who, in phase $i$, thinks it might be the leader, send out a ``probe'' message in each direction that goes $2^i$ hops and comes back if it finds no processor with lower id value. During the processing of a phase, a processor performs the leader-election algorithm as usual, except that each message now contains a hop count that is initialized at $2^i$ and is decremented each time a message crosses a processor. When the hop count goes to $0$, then the processor handling it, sends it back in the direction it came from (rather than sending it on). Since only processors that think they might be leaders are the ones sending messages, the number of processors sending messages in round $i$ is no more than $n/2^i$ (starting the numbering at $0$). Since the probe message goes at most $O(2^i)$ hops in each case, there are $O(n)$ messages sent in each round. Therefore, given that there are $O(\log n)$ rounds, the total message complexity is $O(n\log n)$.