
Detect loop in the DNS setup #37

Open

s-allius opened this issue Mar 10, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@s-allius
Owner

If the DNS name logger.talent-monitoring.com is resolved back to the proxy itself, countless connections are established through this loop until the system crashes.
This is a misconfiguration of the DNS setup; the proxy should recognize it and refuse to establish the outgoing connection. The proxy then keeps running and at least delivers the data to the MQTT broker.
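As a starting point, a minimal sketch of the naive check (function name and logic are illustrative assumptions, not the proxy's actual API): resolve the cloud hostname and compare the result against the host's own addresses before dialing out. As discussed further below, this misses NAT and hairpin-NAT setups, where the loop goes through an address that is not local to the host.

```python
import socket

def resolves_to_self(hostname: str, port: int) -> bool:
    """Return True if hostname resolves only to this host's own addresses.

    Hypothetical helper: catches the trivial loop case (hostname points
    at loopback or the host's primary IP), but NOT loops via NAT.
    """
    try:
        remote = {ai[4][0] for ai in socket.getaddrinfo(hostname, port)}
    except socket.gaierror:
        return False  # name does not resolve at all: no loop, just broken DNS
    local = {"127.0.0.1", "::1"}
    try:
        local.add(socket.gethostbyname(socket.gethostname()))
    except socket.gaierror:
        pass  # hostname of this machine may not resolve in some containers
    return remote <= local
```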

@s-allius s-allius added the enhancement New feature or request label Mar 10, 2024
@Cardes

Cardes commented Jun 16, 2024

Do you already have an idea how to figure out whether the DNS entry resolves to the proxy itself?
I assume a DNS resolution check would only be feasible by comparing external and internal DNS server responses, and users might want to opt out by using an external/fixed DNS server.
Another option would be to call one of the proxy's own endpoints (i.e. /-/healthy) via the external DNS name: if it gets an answer, that would be a good indication that we have the bogus condition, but it would also take longer until the timeout at the cloud endpoint is reached.

Do you see any other or better approach?

@s-allius
Owner Author

It is really not simple. Since the user may have firewall rules, NAT, hairpin NAT, etc., I would prefer to send a real packet to the well-known ports 5005 and 10000. So the question is how to mark this packet:

  1. wrong coding -> this can be dangerous, because the inverter or the TSUN cloud might crash
  2. a special msg type or register -> also dangerous
  3. for GEN3PLUS we can use a different/wrong CRC algorithm, but this will not work for GEN3
  4. for GEN3 we can set a magic email address in the first packet, which we can easily detect (e.g. @test.com according to RFC 2606)

So options 3 and 4 are my favorites.
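Option 4 could be sketched like this, assuming the proxy can inspect the payload of each new inbound connection. The marker string and helper name are illustrative assumptions; the real packet layout would place the email address at a fixed field in the first GEN3 packet.

```python
# Hypothetical marker following the "magic email address" idea: the probe
# packet carries a reserved-style address, and the inbound side scans for it.
PROBE_MARKER = b"loop-probe@test.com"

def is_probe_packet(payload: bytes) -> bool:
    """Return True if this inbound packet is our own loop probe.

    Receiving our own marker back means the outgoing connection looped
    straight back to the proxy: the bogus DNS condition.
    """
    return PROBE_MARKER in payload
```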

Your approach of using the HTTP endpoint is very good because it is independent of the inverter protocols. I'm not sure whether a standard Linux system sends a packet addressed to its own IP to the gateway, or whether it is routed back inside the kernel. In the second case the test packet cannot be discarded externally, which would be great.
But the IP of the Docker engine need not be the IP that the DNS name resolves to. I use hairpin NAT, so I resolve logger.talent-monitoring.com to the local IP 172.16.30.x, which is then forwarded to the Docker engine with the IP 172.16.20.y.
So this might not work in every case.

@Cardes

Cardes commented Jun 17, 2024

Option 1

Let's assume we do a request to logger.talent-monitoring:8127/-/ready.
If we have the bogus condition and nothing prevents the connection, we get an "Is ready" response.
If we get a timeout, we don't know whether it is the real cloud or something blocking the request (firewall).

  • Upside: no harm to the TSUN cloud, as the port is blocked
  • Downside: not all circumstances can be detected
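A minimal sketch of this probe using only the standard library; hostname, port, and path follow the comment above, and the function name is an assumption. A 200 response means our own health endpoint answered, i.e. the name loops back; any error is inconclusive (real cloud, firewall, or timeout).

```python
import urllib.request
import urllib.error

def loops_back(host: str = "logger.talent-monitoring.com",
               port: int = 8127, timeout: float = 2.0) -> bool:
    """Probe the proxy's own readiness endpoint via the external name.

    True  -> the endpoint answered: the DNS name loops back to this proxy.
    False -> timeout/refusal: could be the real cloud or a firewall.
    """
    url = f"http://{host}:{port}/-/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```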

Option 2

Let's assume we send a marked packet (option 3 or 4, or the checksum idea) to the cloud endpoint on either port 5000 or 10000.
If we have the bogus condition, the proxy recognizes the CRC, mail address, or checksum (more on that idea later; no idea if it's feasible).
If we don't have the bogus condition, the TSUN cloud needs to digest the wrong packet, and we have to hope nothing breaks there.

  • Upside: highest confidence in bogus detection, as this should work in all cases
  • Downside: might break something on the cloud end

Idea: Checksum Workflow

  • Proxy receives the first packet from the inverter

  • Proxy creates a checksum of the packet

  • Proxy tries to send this packet to the cloud

  • Proxy checks whether packets 2–n received "from the inverter" have the same checksum and assumes the bogus situation if true

  • Upside: no harm to the TSUN cloud

  • Downside: not sure whether identical packets can also arrive under normal conditions
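The workflow above could be sketched as follows; the class and method names are illustrative assumptions, and SHA-256 stands in for whatever checksum the proxy would actually use. Requiring more than one echo before flagging reduces the risk from the downside noted above (an inverter legitimately resending the same packet once).

```python
import hashlib

class LoopChecksum:
    """Hash the first inverter packet and watch for it coming back."""

    def __init__(self) -> None:
        self._first_digest: bytes | None = None
        self._repeats = 0

    def observe(self, packet: bytes) -> bool:
        """Feed one inbound packet; return True once a loop is suspected."""
        digest = hashlib.sha256(packet).digest()
        if self._first_digest is None:
            self._first_digest = digest      # packet 1: remember its checksum
            return False
        if digest == self._first_digest:     # packets 2..n: compare
            self._repeats += 1
        return self._repeats >= 2            # two identical echoes -> bogus
```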

@s-allius
Owner Author

If we have a loop, we will get a lot of new connections from the same endpoint with the same inverter serial number. That can be normal if the inverter has detected a problem and establishes a new connection to solve it; in this case the time between two connections should be more than a second (there must be a timeout...).
If the time between connections is very small (a few ms), then it might be a loop. The first outgoing connection from the proxy may need a DNS resolution; after that is done, the loop will be very fast.

Detection:

  • there are at least 3 connections from inverters with the same serial number
  • and the time between the second and third connection establishment is very short

What do you think about this approach?
