Date of Award

Spring 1-1-2013

Document Type


Degree Name

Doctor of Philosophy (PhD)


Computer Science

First Advisor

Timothy X Brown

Second Advisor

Eric W. Frew

Third Advisor

Lijun Chen

Fourth Advisor

Nikolaus Correll

Fifth Advisor

Michael C. Mozer


Given multiple widespread stationary data sources (nodes), an unmanned aircraft (UA) can fly over the sensors and gather the data via a wireless link. This is known as data ferrying or data muling, and finds application in surveillance and scientific monitoring of remote and inaccessible regions. Desiderata for such a network include competing objectives related to latency, bandwidth, power consumption by the nodes, and tolerance for imperfect environmental information. For any design objective, network performance depends upon the control policies of the UA and the nodes. A model of such a system permits optimal planning, but is difficult to acquire and maintain. Node locations may not be precisely known. Radio fields are directional and irregular, affected by antenna shape, occlusions, reflections, diffraction, and fading. Complex aircraft dynamics further hamper planning. The conventional approach is to plan trajectories using approximate models, but inaccuracies in the models degrade the quality of the solution. To provide an alternative to building and maintaining detailed environmental and system models, we present a model-free learning framework for trajectory optimisation and control of node radio transmission power in UA-ferried sensor networks. We introduce policy representations that are easy both for learning algorithms to manipulate and for off-the-shelf autopilots and radios to work with. We show that the policies can be optimised through direct experience with the environment. To speed and stabilise the policy learning process, we introduce a metapolicy that learns through experience with past scenarios, transferring knowledge to new problems. Algorithms are tested using two radio propagation simulators, both of which produce irregular radio fields not commonly studied in the data-ferrying literature. The first introduces directional antennas and point noise sources. The second additionally includes interaction with terrain.
Under the simpler radio simulator, the proposed algorithms generally perform within ~15% of optimal performance after a few dozen trials. Environments produced by the terrain-based simulator are more challenging, with learners generally approaching to within ~40% of optimal performance in similar time. We show that under either simulator even small modelling errors can reduce the optimal planner's performance below that of the proposed learning approach.
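
As an illustration of what "model-free" means here, the following is a minimal sketch of learning a trajectory from trial flights alone. It is not the thesis's algorithm, policy representation, or simulator: the single-waypoint policy, the random-perturbation hill climber, and every name and constant (`hidden_data_rate`, `fly_trial`, `learn_waypoint`, the node position, the antenna direction) are assumptions made for the example. The learner never inspects the radio field; it only sees the total data collected on each trial.

```python
import math
import random


def hidden_data_rate(x, y):
    """Toy stand-in for an unknown, irregular radio field (assumption)."""
    node_x, node_y = 3.0, 2.0  # hypothetical node position
    dx, dy = x - node_x, y - node_y
    dist_sq = dx * dx + dy * dy + 1.0  # +1 avoids division by zero
    # Directional gain: strongest along a hypothetical 45-degree boresight.
    angle = math.atan2(dy, dx)
    gain = 1.0 + 0.8 * math.cos(angle - math.pi / 4)
    return gain / dist_sq


def fly_trial(waypoint, steps=50):
    """Fly start -> waypoint -> end and return the total data collected.

    Each leg is sampled a fixed number of times, so flight time and
    aircraft dynamics are deliberately not modelled in this toy.
    """
    start, end = (0.0, 0.0), (6.0, 0.0)
    total = 0.0
    for leg_start, leg_end in ((start, waypoint), (waypoint, end)):
        for i in range(steps):
            t = i / steps
            x = leg_start[0] + t * (leg_end[0] - leg_start[0])
            y = leg_start[1] + t * (leg_end[1] - leg_start[1])
            total += hidden_data_rate(x, y)
    return total


def learn_waypoint(trials=40, step_size=0.5, seed=0):
    """Improve the waypoint using only the reward observed on each trial."""
    rng = random.Random(seed)
    best = (3.0, 0.0)  # initial guess: fly straight from start to end
    best_reward = fly_trial(best)
    history = [best_reward]
    for _ in range(trials):
        # Perturb the current best waypoint and test it with one flight.
        candidate = (
            best[0] + rng.gauss(0.0, step_size),
            best[1] + rng.gauss(0.0, step_size),
        )
        reward = fly_trial(candidate)
        if reward > best_reward:  # keep the candidate only if it did better
            best, best_reward = candidate, reward
        history.append(best_reward)
    return best, history


if __name__ == "__main__":
    waypoint, history = learn_waypoint()
    print("learned waypoint:", waypoint)
    print("reward before and after:", history[0], history[-1])
```

Because a candidate is kept only when its trial reward improves on the best seen so far, the recorded reward never decreases across trials. A practical learner would also have to cope with noisy rewards, flight time, and aircraft dynamics, none of which this sketch addresses.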