===========================
Telemetry framework in LLVM
===========================

.. contents::
   :local:

.. toctree::
   :hidden:

Objective
=========

The Telemetry framework provides a common way in LLVM to collect usage and
performance metrics.
It is declared in ``llvm/Telemetry/Telemetry.h``.

Characteristics
---------------
* Configurable and extensible by:

  * Tools: any tool that wants to use Telemetry can extend and customize it.
  * Vendors: toolchain vendors can also provide a custom implementation of the
    library, which can either override or extend the given tool's upstream
    implementation, to best fit their organization's usage and privacy models.
  * End users: users of such a tool can also configure Telemetry (as allowed by
    their vendor).

Important notes
---------------

* There is no concrete implementation of a Telemetry library in upstream LLVM.
  Only the abstract API is provided here. Any tool that wants telemetry must
  implement its own.

  The rationale is that the tools in LLVM differ widely in what they care
  about (what, where, and when to instrument). Hence, a single implementation
  might not be practical.
  However, if enough common patterns emerge in the future, they can be
  extracted into a shared place. This is TBD; contributions are welcome.

* No implementation of Telemetry in upstream LLVM shall store any of the
  collected data, for privacy and security reasons:

  * Different organizations have different privacy models:

    * Which data is sensitive, and which is not?
    * Is it acceptable for instrumented data to be stored anywhere
      (for example, in a local file)?

  * Data ownership and data collection consent are hard to accommodate from
    LLVM developers' point of view:

    * For example, data collected by Telemetry is not necessarily owned by the
      user of an LLVM tool with Telemetry enabled, hence the user's consent to
      data collection is not meaningful. On the other hand, LLVM developers
      have no reasonable way to request consent from the "real" owners.


High-level design
=================

Key components
--------------

The framework consists of four important classes:

* ``llvm::telemetry::Manager``: The class responsible for collecting and
  transmitting telemetry data. This is the main point of interaction between the
  framework and any tool that wants to enable telemetry.
* ``llvm::telemetry::TelemetryInfo``: Data courier, which carries instrumented
  data from the tool to the framework.
* ``llvm::telemetry::Destination``: Data sink to which the Telemetry framework
  sends data.
  The framework does not depend on how it is implemented.
  It is up to the vendor to decide which pieces of data to forward and where
  to forward them for final storage.
* ``llvm::telemetry::Config``: Configuration for the ``Manager``.

.. image:: llvm_telemetry_design.png

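As a rough illustration of how the four classes fit together, the following
self-contained sketch models the data flow with simplified stand-in types. All
names live in a hypothetical ``sketch`` namespace and are not the real
``llvm::telemetry`` classes: failures are reported with ``bool`` rather than
``llvm::Error``, the session stamping that a real ``preDispatch()`` would do is
inlined into ``dispatch()``, and ``RecordingDestination`` keeps data in memory
purely so the flow can be observed.

.. code-block:: c++

  #include <cassert>
  #include <memory>
  #include <string>
  #include <utility>
  #include <vector>

  namespace sketch {

  // Stand-in for llvm::telemetry::Config.
  struct Config {
    bool EnableTelemetry = false;
  };

  // Stand-in for llvm::telemetry::TelemetryInfo: the data courier.
  struct TelemetryInfo {
    std::string SessionId;
    virtual ~TelemetryInfo() = default;
  };

  // Stand-in for llvm::telemetry::Destination: the data sink.
  class Destination {
  public:
    virtual ~Destination() = default;
    // Returns true on success (assumption: bool instead of llvm::Error).
    virtual bool receiveEntry(const TelemetryInfo &Entry) = 0;
  };

  // Stand-in for llvm::telemetry::Manager.
  class Manager {
  public:
    explicit Manager(std::string Id) : SessionId(std::move(Id)) {}

    void addDestination(std::unique_ptr<Destination> Dest) {
      Destinations.push_back(std::move(Dest));
    }

    // Stamp the entry, then forward it to every registered destination.
    bool dispatch(TelemetryInfo &Entry) {
      Entry.SessionId = SessionId; // Plays the role of preDispatch().
      bool AllOk = true;
      for (auto &Dest : Destinations)
        AllOk = Dest->receiveEntry(Entry) && AllOk;
      return AllOk;
    }

  private:
    std::string SessionId;
    std::vector<std::unique_ptr<Destination>> Destinations;
  };

  // Illustration-only sink that remembers the session ids it was given.
  class RecordingDestination : public Destination {
  public:
    explicit RecordingDestination(std::vector<std::string> &Out) : Seen(Out) {}
    bool receiveEntry(const TelemetryInfo &Entry) override {
      Seen.push_back(Entry.SessionId);
      return true;
    }

  private:
    std::vector<std::string> &Seen;
  };

  // Returns the session ids observed by the sink after a single dispatch.
  inline std::vector<std::string> runFlow(const Config &Cfg) {
    std::vector<std::string> Seen;
    if (!Cfg.EnableTelemetry)
      return Seen; // Telemetry disabled: nothing is created or sent.
    Manager M("session-1");
    M.addDestination(std::make_unique<RecordingDestination>(Seen));
    TelemetryInfo Entry;
    M.dispatch(Entry);
    return Seen;
  }

  } // namespace sketch
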
How to implement and interact with the API
------------------------------------------

To use Telemetry in your tool, you need to provide concrete implementations of
the ``Manager`` and ``Destination`` classes.

1) Define a custom ``Serializer``, ``Manager``, and ``Destination``, and
   optionally a subclass of ``TelemetryInfo``.

.. code-block:: c++

  class JsonSerializer : public Serializer {
  public:
    json::Object *getOutputObject() { return Out.get(); }

    // Start a new serialization; fails if one is already in progress.
    Error init() override {
      if (Started)
        return createStringError("Serializer already in use");
      Started = true;
      Out = std::make_unique<json::Object>();
      return Error::success();
    }

    // Serialize the given value.
    void write(StringRef KeyName, bool Value) override {
      writeHelper(KeyName, Value);
    }

    void write(StringRef KeyName, int Value) override {
      writeHelper(KeyName, Value);
    }

    void write(StringRef KeyName, long Value) override {
      writeHelper(KeyName, Value);
    }

    void write(StringRef KeyName, long long Value) override {
      writeHelper(KeyName, Value);
    }

    void write(StringRef KeyName, unsigned int Value) override {
      writeHelper(KeyName, Value);
    }

    void write(StringRef KeyName, unsigned long Value) override {
      writeHelper(KeyName, Value);
    }

    void write(StringRef KeyName, unsigned long long Value) override {
      writeHelper(KeyName, Value);
    }

    void write(StringRef KeyName, StringRef Value) override {
      writeHelper(KeyName, Value);
    }

    // Open a nested object; subsequent writes go into it until endObject().
    void beginObject(StringRef KeyName) override {
      Children.push_back(json::Object());
      ChildrenNames.push_back(KeyName.str());
    }

    // Close the innermost nested object and attach it to its parent.
    void endObject() override {
      assert(!Children.empty() && !ChildrenNames.empty());
      json::Value Val = json::Value(std::move(Children.back()));
      std::string Name = ChildrenNames.back();

      Children.pop_back();
      ChildrenNames.pop_back();
      writeHelper(Name, std::move(Val));
    }

    Error finalize() override {
      if (!Started)
        return createStringError("Serializer not currently in use");
      Started = false;
      return Error::success();
    }

  private:
    // Write into the innermost open object, or the top-level one if none.
    template <typename T> void writeHelper(StringRef Name, T Value) {
      assert(Started && "serializer not started");
      if (Children.empty())
        Out->try_emplace(Name, std::move(Value));
      else
        Children.back().try_emplace(Name, std::move(Value));
    }
    bool Started = false;
    std::unique_ptr<json::Object> Out;
    std::vector<json::Object> Children;
    std::vector<std::string> ChildrenNames;
  };

  // This defines a custom TelemetryInfo that has an additional Msg field.
  // It is defined before MyManager because MyManager::logStartup() uses it.
  struct MyTelemetryInfo : public telemetry::TelemetryInfo {
    std::string Msg;
    // Timestamps filled in by the tool (see the init-process example below).
    std::chrono::steady_clock::time_point Start;
    std::chrono::steady_clock::time_point End;

    void serialize(Serializer &Serializer) const override {
      TelemetryInfo::serialize(Serializer);
      Serializer.write("MyMsg", Msg);
    }

    // Note: implement getKind() and classof() to support dyn_cast operations.
  };

  class MyManager : public telemetry::Manager {
  public:
    static std::unique_ptr<MyManager> createInstance(telemetry::Config *Config) {
      // If Telemetry is not enabled, then just return null.
      if (!Config->EnableTelemetry)
        return nullptr;
      return std::make_unique<MyManager>();
    }
    MyManager() = default;

    Error preDispatch(TelemetryInfo *Entry) override {
      Entry->SessionId = SessionId;
      return Error::success();
    }

    // You can also define additional instrumentation points.
    void logStartup(MyTelemetryInfo *Entry) {
      // Add some additional data to the entry.
      Entry->Msg = "Some message";
      // dispatch() returns an Error, which must not be dropped unchecked.
      // This example simply discards it.
      if (Error Err = dispatch(Entry))
        consumeError(std::move(Err));
    }

    void logAdditionalPoint(TelemetryInfo *Entry) {
      // .... code here
    }

  private:
    const std::string SessionId;
  };

  class MyDestination : public telemetry::Destination {
  public:
    Error receiveEntry(const TelemetryInfo *Entry) override {
      if (Error Err = Serializer.init())
        return Err;

      Entry->serialize(Serializer);
      if (Error Err = Serializer.finalize())
        return Err;

      json::Object Copied = *Serializer.getOutputObject();
      // Send the `Copied` object to wherever.
      return Error::success();
    }

    // Destination also requires a name.
    StringLiteral name() const override { return "MyDestination"; }

  private:
    JsonSerializer Serializer;
  };


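The note about ``getKind()`` and ``classof()`` in ``MyTelemetryInfo`` can be
illustrated with a minimal, self-contained sketch of the LLVM-style RTTI
pattern. Everything here is a stand-in in a hypothetical ``sketch`` namespace:
the ``MyInfo`` kind value is made up, and ``dynCast`` is a tiny substitute for
``llvm::dyn_cast`` so that the example compiles without LLVM headers. In a real
tool you would use ``llvm::dyn_cast`` from ``llvm/Support/Casting.h`` and pick
your own kind values.

.. code-block:: c++

  #include <cassert>
  #include <string>

  namespace sketch {

  using KindType = unsigned;

  // Hypothetical kind values; a real tool would choose its own scheme.
  struct EntryKind {
    static constexpr KindType Base = 0;
    static constexpr KindType MyInfo = 1;
  };

  // Stand-in for the base entry type.
  struct TelemetryInfo {
    std::string SessionId;
    virtual ~TelemetryInfo() = default;
    virtual KindType getKind() const { return EntryKind::Base; }
    static bool classof(const TelemetryInfo *T) {
      return T->getKind() == EntryKind::Base;
    }
  };

  // A subclass reports its own kind and recognizes it in classof().
  struct MyTelemetryInfo : TelemetryInfo {
    std::string Msg;
    KindType getKind() const override { return EntryKind::MyInfo; }
    static bool classof(const TelemetryInfo *T) {
      return T->getKind() == EntryKind::MyInfo;
    }
  };

  // Minimal substitute for llvm::dyn_cast: consult To::classof() first.
  template <typename To> const To *dynCast(const TelemetryInfo *From) {
    return To::classof(From) ? static_cast<const To *>(From) : nullptr;
  }

  // Returns the message if Entry is a MyTelemetryInfo, otherwise "".
  inline std::string messageOf(const TelemetryInfo *Entry) {
    if (const auto *Mine = dynCast<MyTelemetryInfo>(Entry))
      return Mine->Msg;
    return "";
  }

  } // namespace sketch
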
2) Use the library in your tool.

Logging the tool's init process:

.. code-block:: c++

  // In tool's initialization code.
  auto StartTime = std::chrono::steady_clock::now();
  telemetry::Config MyConfig = makeConfig(); // Build up the appropriate Config struct here.
  auto Manager = MyManager::createInstance(&MyConfig);


  // Any other tool's init code can go here.
  // ...

  // Finally, take a snapshot of the time now so we know how long it took the
  // init process to finish.
  auto EndTime = std::chrono::steady_clock::now();
  MyTelemetryInfo Entry;

  Entry.Start = StartTime;
  Entry.End = EndTime;
  // createInstance() returns null when Telemetry is disabled.
  if (Manager)
    Manager->logStartup(&Entry);

Similar code can be used for logging the tool's exit.

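One possible shape for exit logging is sketched below. It is a self-contained
illustration that does not use the LLVM API: ``ExitInfo``, ``ExitLogger``, and
the ``sketch`` namespace are hypothetical names, and the report callback stands
in for whatever instrumentation point your ``Manager`` subclass exposes (for
example, a ``logExit()`` method that you would have to define yourself). The
logger records when shutdown starts, and ``finish()`` reports the timing and
exit code once teardown is done.

.. code-block:: c++

  #include <cassert>
  #include <chrono>
  #include <functional>
  #include <utility>

  namespace sketch {

  using Clock = std::chrono::steady_clock;

  // Hypothetical entry type describing the tool's exit sequence.
  struct ExitInfo {
    Clock::time_point Start;
    Clock::time_point End;
    int ExitCode = 0;
  };

  // Records when shutdown begins and reports once it has finished.
  class ExitLogger {
  public:
    using Sink = std::function<void(const ExitInfo &)>;

    // Construct this at the start of the tool's teardown code.
    explicit ExitLogger(Sink Callback) : Report(std::move(Callback)) {
      Info.Start = Clock::now();
    }

    // Call after teardown has run; forwards the entry to the sink, if any.
    void finish(int ExitCode) {
      Info.End = Clock::now();
      Info.ExitCode = ExitCode;
      if (Report)
        Report(Info);
    }

  private:
    Sink Report;
    ExitInfo Info;
  };

  } // namespace sketch
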