"It took multiple attempts to get the connection string right." "if the broker returns the error code LEADER_NOT_AVAILABLE, the producer can try sending the error again" The paragraph in the previous page should explain also hopping window. seekToBeginning(Collection tp) and seekToEnd(Collection tp)? mysql> commit; Typo: Last sentence in paragraph after code example: At this point in the book pause functionality has not yet been discussed. "{\" namespace\" .... " + KafkaConsumer consumer = new - Unclear how the tool knows how to deserialize the raw bytes. Since the result ("FileStream-Source") indeed doesn't work, I suggest breaking the line after the preceding column (i.e. Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Both consumer groups will get all messages. And how to move all of this data becomes nearly as important as the data itself. getFaxNumber(). realtime should be relatime. return numPartitions; Additionally, that can be simplified with `getOrDefault()` method: Is this a shorthand for "min.insync.replicas"? "The committed offset should always be the offset of the next message that your application will read." Besides it should have type int (not String). Note from the Author or Editor:Good point. - explain how to send "end of file" (for beginners): CRTL+D, Note from the Author or Editor:Serialization - it's explained on p202 at the start of the section that it outputs raw bytes by default. Other errata mentions this also appearing in "section 710" of the kindle edition but I'm confirming here that it's in the pdf on page 30. Chapter 4. What is `payload` -- key? } "After tasks are initialized, the are started..." should be "After tasks are initialized, they are started...". 1. -> "@Override" should be after JavaDoc comment ? 
IIRC it is not introduced until chapter six. In This is correct. Should the configuration property be "tasks.max" instead? That is a typo. Perhaps changing to "Here, the key will simply be set to null." The "Fraud Detection" list item title is formatted differently than the other two ("Customer Service" and "Internet of Things") appearing on the previous page. If offset 5000 was consumed, one should commit offset 5001. This will be resolved in the next update, Note from the Author or Editor: Indeed. The '--bootstrap-server' parameter given in the example command to list new consumer groups is wrong. last sentence: "... for produce[r] or consume[r] clients." "This is done by setting `key.ignore` configuration to `true`" I have added a note that the class name is different in 0.11. "returned a non-retriable error", "Gzip compression will typically use more CPU but result[s] in better compression ratios,..." -> "// nothing to configure" should be indented by two spaces (not just one)? Note from the Author or Editor: The example uses single quotes, which do not require escaping the dollar sign. This error repeats twice. It's unclear how this is done? "--consumer.config CONFIGFILE, where _CONFIGFILE_ is the full ..." We should remove the sentence "It took multiple attempts to get the connection string right." Thank you for the fantastic work on this book. "...it can take a few month to get right" should be "...it can take a few month to get everything right", In the paragraph logical topic "user" is defined to be composed of "SF.users" and "NYC.users" but is later referred to as ".users" (with a leading period) in Hard to see how twice as large is derived. Perhaps just say "much larger". "Setting `auto.offset.reset` to `none` will cause an exception to be thrown when attempting to consume from invalid offset."
There was a "pause" explanation, which we removed (new versions of the consumer made "pause" less useful and we decided to leave it out). The deserializers are correct in the recent version, but the reader is correct that a property is missing. Two missing "s" as highlighted. is unclear. "This can be done by creating a JSON object with the format used in the execution step of partition assignment that adds or removes replicas to set the replication factor." Note from the Author or Editor: This comment is correct. This is in the kindle edition of the book -- "location 710", not page 710. Should be: - Explain Auto-Offset behavior. Bad line wrapping. "Consumers will need to consume events from .users if they wish to consume...". - first 6 lines are indented too widely. The example also seems to work on a 2-node cluster, thus the new assignment is the same as the old. *")); Code formatting inconsistent with other example: - What if those brokers don't exist? name = new String(nameBytes, 'UTF-8'); "In the diagram, the real lag is 2, but the Later it talks about the timeout parameter passed into poll() itself -- this was already covered by bullet point #2 and does not belong to #3. -> "return customerName;" should be indented by two spaces (not just one)? new OffsetAndMetadata(record.offset()+1); The code for "String schemaString" variable has bad line wrappings and is hard to read. if (custCountryMap.countainsValue(record.value())) { int wait = 500; "It has enough messages to fill a batch based on the max.partition.bytes configuration" If I go to the Kafka docs there is a bold note: I think it got chopped by the PDF generator. "Consumers take a significant performance hit when connecting to Kafka with SSL encryption---much more so then producers." The message data is raw bytes, without keys. "...resulting in one partition being much larger than the rest. In Note from the Author or Editor: Right. Note from the Author or Editor: Correct.
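On the suggestion that the country-count update can be simplified with `getOrDefault()`: the following is only a minimal sketch, not the book's code. It assumes the map is keyed by country name, so the lookup is by key rather than by value. The names `CountryCountSketch` and `countByCountry` are hypothetical, and the Kafka consumer loop is left out so that the snippet runs on its own.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the counting step inside a consumer poll loop.
class CountryCountSketch {

    // Counts how often each country appears; getOrDefault() replaces the
    // "check for the key first, then get and increment" pattern.
    static Map<String, Integer> countByCountry(List<String> countries) {
        Map<String, Integer> counts = new HashMap<>();
        for (String country : countries) {
            // A missing key is treated as a count of 0 (assumption: map is keyed by country).
            counts.put(country, counts.getOrDefault(country, 0) + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countByCountry(Arrays.asList("US", "DE", "US"));
        System.out.println(counts.get("US")); // prints 2
    }
}
```

`Map.merge(country, 1, Integer::sum)` would be an equally short alternative; `getOrDefault()` is used here only because it is the method the comment names.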
Using customer.getName() would align the code with the schema. "Topic __conusmer_offsets", Note from the Author or Editor: The labels should be, from top to bottom: There is also a switching of deserializer/serializer in the line before the second set of example code): Note from the Author or Editor: Good point. Does not explain what "ideal leader" is. - owner: is this the same as `client.id`? Should have been: In "In a perfectly balanced cluster, the numbers will be even across all brokers in the cluster, as in Table 10-2." "partition = %s" This is because use of SSL requires copying data for encryption, which means consumers no longer enjoy the performance benefits of the usual zero-copy optimization". The language has been clarified to note this in both sections. customer.put("id", nCustomers); byte[] nameBytes = new byte[nameSize]; The "throw new SerializationException" text is: Thus So: Plural intended or should it be "content"? if (e != null) 1. {"name": "name", "type": "string""}, It's also unclear how the tool computes the recommended assignment.
In callout #1, "AvroSerializer" should be "KafkaAvroSerializer". Such applications need to maintain state within the application because each event can be handled independently. -> JavaDoc comment should be formatted differently, each line starting with "*"? we used the Gson library from Google to generate a JSon serializer and deserializer from our Java object. The statement Is it intended to mean any of commitSync() and commitAsync()? There is no path after the host:port and should be the following Second example magically prints the message. Should it be Second example "class CustomSerializer" Why? Topic B, Partition 0 ", Note from the Author or Editor: The wording has been changed to "configuration parameters". My assumption is that those X's were intended to be replaced with numeric values of some sort :), Note from the Author or Editor: Thanks for the note! Both? key.converter.schema.enable hasn’t committed offsets for more recent messages yet. Kafka Consumers: Reading Data from Kafka. "--consumer-property KEY=VALUE, where _KEY_ is the configuration option name and _VALUE_ is the value ..." The two should be consistent. Second headline says: "Confluent's Replicator" consumer.wakeup().". Examples show output from the executed commands. "returned a nonretriable exceptions" "partition = %d". Should it be Why is commit necessary? Some more details would be nice. Customer customer = CustomerGenerator.getNext(); The ovals in the bottom right of the figure should be labeled IOThread (not Processor Thread). and "if (data.length < 8)" It is not clear how the AVRO serializer can handle class Customer or where the Customer class comes from.
KafkaConsumer consumer = new KafkaConsumer<>(props); I think they should be (from top to bottom): most, one consumer reading from a partition, so if you know should be changed to The line: Example: `"replicas": [1,2]` A variable "kafkaProps" is defined as a local variable with modifier "private". This means that the above command will mark a topic for deletion, but the deletion will not happen immediately. "The reading application will contain calls to methods similar to getName(), getId(), and getFaxNumber." kafka-consumer-groups tool will report a lag of 5 because MirrorMaker In the output format string The callout explanation should be: SerializationException("Error when serializing " + "Customer to byte[] " + e); "Consumers take a significant performance hit when connecting to Kafka with SSL encryption---much more so then producers. Or does MySQL behave differently than I expect? "The entire data pipeline should standardize on a single time zones;" Instead of "Now the only problem is if the record is stored in a database and not in kafka" page 85 should say "Now the only problem is if the offset is stored in a database and not in kafka". The official notation is also "JSON". The configuration parameter should be min.insync.replicas. - no newline for "for-loop"; curly brace should be in same line and Should be: "if the broker returns the error code LEADER_NOT_AVAILABLE, the producer can try sending the error again" It's unclear why SSL should be more expensive for consumers than producers.
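On the committed-offset comments (offset 5000 consumed, so offset 5001 is committed; `record.offset()+1`): the arithmetic can be sketched in plain Java without the Kafka client. This is an illustration under assumptions, not the book's listing. The names `CommitOffsetSketch` and `offsetToCommit` and the partition labels are hypothetical; in a real consumer the computed value would be wrapped in an `OffsetAndMetadata` and passed to `commitSync()` or `commitAsync()`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: shows only the "+1" rule, with no Kafka dependency.
class CommitOffsetSketch {

    // The committed offset is the offset of the NEXT message to read,
    // i.e. the last processed offset plus one.
    static long offsetToCommit(long lastProcessedOffset) {
        return lastProcessedOffset + 1;
    }

    public static void main(String[] args) {
        // Assumed bookkeeping: last processed offset per partition (labels are made up).
        Map<String, Long> lastProcessed = new HashMap<>();
        lastProcessed.put("customers-0", 5000L);
        lastProcessed.put("customers-1", 41L);

        Map<String, Long> toCommit = new HashMap<>();
        for (Map.Entry<String, Long> entry : lastProcessed.entrySet()) {
            toCommit.put(entry.getKey(), offsetToCommit(entry.getValue()));
        }

        System.out.println(toCommit.get("customers-0")); // prints 5001
    }
}
```

Committing the last processed offset itself, without the increment, would cause that message to be read again after a restart or rebalance, which is why the comments insist on the `+1`.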